Table 2-1. GMC Release Dependent Parameters
(Part 1 of 11)

Element  Program  LINE #  Variable  Explanation

1  GMF.TOP  100  SYS64  Used to control conditional assembly
of GMC. Set to 1 for the W6.4 (2H) release;
set to 0 for the W7.2 (4J) release.

2  10240-10740  -  Code in this area searches for trace
processing within the dispatcher.
Trace code must be within 500 octal
locations of the address specified
by entry point 15 decimal of the
dispatcher. This entry point should
contain the address of location
TRACE within the dispatcher, which
is where the trace processing code
is located. The code being searched
for is an LDAQ;STAQ;TRA 0,1. If
these instructions are not found,
GMF will abort with an NF abort. In
order for GMF to operate, the $TRACE
card in the boot deck must not
specify no tracing. At a minimum,
the $TRACE card must request that at
least a single trace be captured.
It is recommended that a site that
wants to minimize trace logic
overhead, but still be able to run
GMF without altering the boot deck,
use the following trace card:

$ TRACE 777777777777,645777777777

3  10760-11180  -  Code in this area is used to make a
correction to accounting processing,
if the correction has not already
been made via patches. The code is
searched for within 500 octal
locations of the .MIOS entry point. The
code searched for is SBLA
TRREG+7,$;ARL 12;ADLA .CRT0D,7.
The ARL is changed to an ARS.

4  MUM.INIT  500-740  -  Code in this area searches the
dispatcher for generation of trace
types 10 and 11. The code searches
for an ORA =O10,DL instruction and
an XED .CRTRC instruction 3
locations in front of the ORA
instruction. If the XED is not
found, then a NOP instruction is
looked for and will be replaced with
an XED instruction. If neither the
XED nor the NOP is found, GMF will
abort with a Z0 abort. The same
logic is repeated, except this time
an ORA =O11,DL is searched for.
These instructions must be located
within 1500 octal locations of entry
point 15 decimal of .MDISP. This
entry point should be the address of
location TRACE within the dispatcher.

5  MSM.INIT  400-600
   CM.INIT  380-570
   CAM.INIT  220-420  -  Code in these areas searches .MFALT
for generation of trace type 15. The
code searches for an ADLA =O15,DL
instruction and an XEC .CRTRC+2
instruction 6 locations in front of
the ADLA instruction. If the XEC
instruction is not found, then a NOP
instruction is looked for and will
be replaced with an XEC instruction.
If neither the NOP nor the XEC is found,
GMF will abort with a Z4 abort.
These instructions must be located
within 1000 octal words of the entry
point of .MFALT.

6  MSM.INIT  610-820
   CM.INIT  580-790  -  Code in these areas searches .MIOS
for generation of trace type 7. The
code searches for an ORA 7,DL
instruction and an XED .CRTRC
instruction 7 locations in front of
the ORA instruction. If the XED
instruction is not found, then a NOP
instruction is looked for and will
be replaced with an XED instruction.
If neither the NOP nor the XED is found,
GMF will abort with a Z1 abort.
These instructions must be located
within 7000 octal locations of the
entry point of .MIOS.

7  CM.INIT  1690-1890  -  Code in this area searches .MIOS for
generation of trace type 4. The
code searches for an ARL 18
instruction and an XED instruction 3
locations in front of the ARL
instruction. If the XED instruction
is not found, then a NOP instruction
is looked for and will be replaced
with an XED instruction. If neither
the NOP nor the XED is found, GMF
will abort with a Z3 abort. These
instructions must be located within
1000 octal locations of the entry
point of .MIOS.

8  CM.INIT  1900-2100  -  Code in this area searches .MIOS for
generation of trace type 22. The
code searches for an ARL 12
instruction and an XED instruction 3
locations in front of the ARL
instruction. If the XED instruction
is not found, then a NOP instruction
is looked for and will be replaced
with an XED instruction. If neither
the NOP nor the XED is found, GMF
will abort with a Z2 abort. These
instructions must be located within
6000 octal locations of the entry
point of .MIOS.

9  CAM.INIT  1110-1520  -  Code in this area searches .MDNET to
locate the CMBXO table. The code
searches for a BCI 1,C instruction
within 5000 decimal locations of the
entry point of .MDNET. If GMF
cannot locate this instruction, GMF
will abort with an NS abort. If
this instruction is found, then a
BCI 1,TSS instruction is searched
for within 30 decimal locations of
the previous instruction. If the
second instruction is not found, GMF
will abort with an NT abort. If
this instruction is found, then an
LDA T.55,2 instruction
(000001235212) is searched for
within 10 decimal locations of the
previous instruction. If this
instruction is not found, GMF will
abort with an MU abort.

10  ILL.INIT  300-1620  -  Code in this area searches for the
generation of a series of trace
types using the same logical
construct as previously described
for the other INIT sections.
Following is a list of instructions
searched for:

(1) LDX2 .CRTRC+3 and an XEC
.CRTRC+2 3 instructions in front

(2) LDX2 .CRTRC+3 and an XEC
.CRTRC+2 3 instructions in front

(3) ARL 12 and an XED .CRTRC 3
instructions in front

(4) ORA =O16,DL and an XEC .CRTRC+2
3 instructions in front

(5) ORA =O37,DL and an XED .CRTRC 4
instructions in front

(6) ORQ =O13,DL and an XEC .CRTRC 7
instructions in front

(7) ORA =O21,DL and an XEC .CRTRC+2
4 instructions in front

These areas of code search for
generation of trace types
0, 3, 22, 16, 37, 13 and 21,
respectively, and if appropriate
code is not found, the following
respective aborts will occur:
Z6, Z6, Z7, Z8, Z9, ZA and ZB.

The modules which are searched are
listed below:

(1) 1000 octal locations from the
entry point of .MIOS

(2) 1000 octal locations from the
entry point of .MIOS

(3) 6000 octal locations from the
entry point of .MIOS

(4) 1000 octal locations from the
entry point of .MFALT

(5)-(7) 6000 octal locations from
the entry point of .MDISP


11  CPU.PAT  300-440  -  Code in this area searches for an
ASA .SALT,3 instruction in the
dispatcher. Code must be within 300
octal locations after the address
specified by entry point 20 decimal
of the dispatcher. This entry point
should contain the address of
location DWAIT within the
dispatcher. The trace code to be
inserted by GMF indicates that a job
is being taken out of processing.
If the ASA instruction is not found,
an N1 abort will occur.

12  460-590  -  In a W7.2 (4JS) system, code in
this area searches for an STQ
.QT0D,4 instruction. Search area is
the same as described for line 300.
This trace indicates that
subdispatching has finished using
the processor. If the STQ
instruction is not found, an N8
abort will occur.


13  640-800  -  If the dispatcher queue option of
the CPU Monitor is activated, code
in this area searches for an ORSA
.STATE,4 instruction followed by an
LDA .STATE,4 instruction. Code must
be within 100 octal locations after
the address specified by entry point
7 of the dispatcher. This entry
point should contain the address of
location DSPQH within the
dispatcher. This trace indicates
that a job is being placed into the
dispatcher queue. If these
instructions are not found, an NA
abort will occur.

14  810-970  -  If the dispatcher queue option of
the CPU Monitor is activated, code
in this area searches for an
LCQ =O10001,DL instruction followed
by an ANQ .STATE,5 instruction.
These instructions must be within
1500 octal locations after the
address specified by entry point 1
of the dispatcher. This entry point
should contain the address of
location DSP within the dispatcher.
This trace indicates that a job is
being taken out of the dispatcher
queue and placed into execution. If
these instructions are not found, an
NB abort will occur.

15  970-1310  -  In order to implant its special
hooks, the CPU Monitor must modify
the dispatcher and, therefore, it
requires eight words of patch
space. If the dispatcher queue
option of the CPU Monitor is
activated, then 16 words, instead of
the normal 8, are required. This
patch area must be within 200 octal
locations in front of the address
specified by entry point 15 decimal
of the dispatcher. This entry point
should contain the address of
location TRACE within the dispatcher.

16  2190-2320  -  If sufficient patch space was not
available in the standard patch
area, the CPU Monitor will attempt
to locate patch space in a specially
defined user patch area. This
search will take place only if bit 2
of word 0 within the dispatcher is
set. This patch space should be
within 200 octal locations after the
address ILIST within the
dispatcher. The address of ILIST is
found in word 7 of the dispatcher.
If sufficient patch space is not
found, an M2 abort will occur.

17  1950-2360  -  Code in this area searches .MFALT
for GATE LOOP timing logic and
attempts to correct logic errors
that exist within the code. The
code searches for an ARL 12 and
ASA .CR0VK,7 instruction within 2000
octal locations of entry point 13 of
.MFALT. This entry point should be
the address of the location BOOT
within .MFALT. Twelve instructions
in front of the ARL instruction, an
LDX6 .CRPRG,7 instruction is searched
for. If this instruction is found,
the code will next obtain the
address stored in the upper half of
the word located two locations in
front of the entry point of .MFALT.
This should be the address of the
MMECNT table. The code will add 46
decimal to this address; the result
should be the address of a patch
table. The code will search for
nine free words and, if they are found,
will correct the logic error. If patch
space is not found, the code will
remain unaltered.


18  CAM.PAT  170-290  -  Code in this area searches for an LDQ
19  M.LID,3 instruction, followed by an
ANQ =O077777,DU instruction in module
DMVW/DMET. The search for this code
begins at the address specified by
entry point -8 of DMWW/DMET and
continues until the code is
located. This entry point should
contain the address of location
MRQVT-DMET within DMVW/DMET. If the
instructions are not found, an M3
abort will occur.

20  350-500 and 690-820  -  In order to implant its special
hooks, the CAM must find 8 words of
patch space. Even though the CAM is
modifying module DNET, it will use
the patch space available in the
.MDISP module. This search for
space is identical to that for
CPU.PAT at lines 970-1310 and lines
2190-2320. If patch space is not
found, an N4 abort will occur.

21  MSM.PAT  200-240  -  Code  in  this  area  searches  the 

dispatcher  for  SSA  cache  code.  If 
bit  4  of  word  0  in  the  dispatcher  is 
set,  then  cache  is  available.  If 
this  bit  is  not  set,  then  no  further 
searches  are  performed. 

22  250-380  -  Code  in  this  area  searches  the 

dispatcher  for  the  location  of 
DBASE.  The  address  at  entry  point 
-2  is  obtained.  This  address  points 
to  the  location  ILIST  in  the 
dispatcher.  At  location  ILIST,  a 
series  of  addresses  are  stored  and 
MSM.PAT  searches  this  list  for  the 
address  DBASE.  If  DBASE  is  not 
found,  an  N5  abort  will  occur. 


23  410-620  -  Code in this area searches for an
AOS .CRTDL and an AOS .CRTBH
instruction. This code needs to be
within 300 octal locations after the
address DBASE within the
dispatcher. If this code is not
found, an N5 or N6 abort will occur.

24  670-790 and 1120-1250  -  In order to implant its special
hooks, the MSM Monitor must modify
the dispatcher and, therefore, it
requires 8 words of patch space.
This search is identical to that for
CPU.PAT at line 680 and lines
970-1310 and 2190-2310. If patch
space is not found, an N7 abort will
occur.

25  GMF.MON  1040  MS1  Offset from entry point of .MFSIO
which points to the word giving the
absolute address of the FMS catalog
cache buffer. Used only in W7.2.
Set to -13 decimal.

26  1050  MS2  Offset from entry point of .MFSIO
pointing to the word which gives the
option selection for MS catalog
cache. Used only in W7.2. Set to
-15 decimal.

27  MUM.T10  220  SYS64  See GMF.TOP

28  920  FIFO  Address of the FIFO buffer within
PALC. It is used to search the JCT
table of PALC. This includes adding
in a 110 octal offset for the
loading of PALC in W6.4. There is
no PALC offset in W7.2.

29  5410  XPQ24  Location in CALC of the memory
demand table. Set to octal 111.

30  5420  SLVSNB  Offset in slave prefix area of job
SNUMB. Set to octal 36.

31  5430  MEMUSE  Offset in slave prefix area of
loader memory use word. Set to
octal 37.

32  5440  IDENT  Offset in slave prefix area of job
IDENT. Set to octal 66.

33  CM.T07A  190  IDENT  Offset in slave prefix area of job
IDENT. Set to octal 66.

34  210  SYS64  See GMF.TOP

35  9960-10130  -  Code in this area searches .MFSIO in
order to gather statistics for MS
catalog cache analysis. All the
following references are offsets
from the entry point of .MFSIO:

-12  # of cache hits
-11  # of writes
-10  # of reads
841  # of reads not in CC
842  # of 320-word reads
843  # of skips
844  # of cache clears
848  # of no hits
849  # of hits

36  10180-10250  -  Code in this area searches .MAS04 in
order to gather statistics
concerning the available space table
utilization. All the following
references are offsets from the
entry point of .MAS04:

-6  # of times buffer allocation attempted
-5  # of times buffer busy
-4  # of times AST was in memory
-3  # of times AST in memory but busy

37  11440  FFCCC  Address in PALC where the file code
is stored during GEFSYE processing.
In PALC, the variable is called FFC.
Set to 6177 octal in W6.4, 13143
octal in W7.2.2, and 13201 octal in
W7.2.3. This includes 110 octal for
loading of PALC in W6.4. There is
no offset for PALC in W7.2.

38  11450  SNUMBP  Address in PALC where the SNUMB is
stored during GEFSYE processing. In
PALC, the variable is called SNUMB. Set
to 35012 octal in W6.4, 2632 octal
in W7.2.2, and 2642 octal in
W7.2.3. This includes 110 octal for
loading of PALC in W6.4. There is
no offset for PALC in W7.2.

39  11460  ACT  Address in PALC where the activity
number is stored during GEFSYE
processing. In PALC, the variable is
called SACT. Set to 33231 octal in
W6.4, 1051 octal in W7.2.2, and 1061
octal in W7.2.3. This includes 110
octal for loading of PALC in W6.4.
There is no offset for PALC in W7.2.
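
The search-and-patch procedure that Table 2-1 describes for the INIT areas (locate an anchor instruction near an entry point, check for the trace call a fixed number of locations in front of it, replace a NOP if the call is missing, otherwise abort) can be sketched as follows. This is only an illustrative model, not GMF code: memory is represented as a Python list of instruction strings, "in front of" is assumed to mean a lower address, the search window is assumed to run forward from the entry point, and the names `GmfAbort` and `find_and_enable_hook` are hypothetical.

```python
from typing import List


class GmfAbort(Exception):
    """Hypothetical stand-in for a GMF abort carrying its two-character code."""

    def __init__(self, code: str) -> None:
        super().__init__(f"GMF abort {code}")
        self.code = code


def find_and_enable_hook(memory: List[str], entry: int, window: int,
                         anchor: str, hook: str, distance: int,
                         abort_code: str) -> int:
    """Find `anchor` within `window` words after `entry` and make sure the
    word `distance` locations in front of it holds `hook`.

    A NOP at that location is overwritten with `hook`. If no anchor in the
    window has either `hook` or NOP in front of it, GmfAbort is raised.
    Returns the address of the hook word.
    """
    end = min(len(memory), entry + window)
    for addr in range(entry, end):
        if memory[addr] != anchor:
            continue
        hook_addr = addr - distance
        if hook_addr < 0:
            continue
        if memory[hook_addr] == hook:      # trace call already in place
            return hook_addr
        if memory[hook_addr] == "NOP":     # placeholder: patch in the trace call
            memory[hook_addr] = hook
            return hook_addr
    raise GmfAbort(abort_code)
```

For element 4, for example, a call would pass the address of TRACE as `entry`, `0o1500` as `window`, the ORA instruction as `anchor`, `"XED .CRTRC"` as `hook`, 3 as `distance`, and `"Z0"` as `abort_code`.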


SECTION 4. RESOURCE MONITOR DATA REDUCTION

The Resource Monitor Data Reduction (RMDR) is composed of three discrete
programs. The first program (PSUMR) processes the SCF records, producing an
intermediate record. The second program is a daily monitor (DEMON) which
maintains an on-line history file. The third program (RPTSUM) produces
reports and prints the required plots. Figure 4-1 is an overview of the
RMDR.


4.1 PSUMR


PSUMR selects the collector data records from an SCF input file. The
records are sorted by system ID, date, time, and record subtype. The
records are then processed in pairs to produce an intermediate record.
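
The sort order and pairwise processing just described can be illustrated with a short sketch. The record layout (`CollectorRecord` and its field names) is hypothetical, and the pairing rule used here, non-overlapping adjacent records within one system ID, is an assumption; the text does not say how PSUMR forms its pairs.

```python
from itertools import groupby
from typing import List, NamedTuple, Tuple


class CollectorRecord(NamedTuple):
    """Hypothetical view of the four sort keys of a collector record."""
    system_id: str
    date: str      # assumed to sort correctly as text, e.g. YYMMDD
    time: str      # assumed HHMM
    subtype: int


def sort_records(records: List[CollectorRecord]) -> List[CollectorRecord]:
    """Sort by system ID, date, time, and record subtype, in that order."""
    return sorted(records, key=lambda r: (r.system_id, r.date, r.time, r.subtype))


def pair_records(ordered: List[CollectorRecord]
                 ) -> List[Tuple[CollectorRecord, CollectorRecord]]:
    """Form adjacent, non-overlapping pairs within each system ID (assumed rule).

    An unpaired trailing record in a system's group is dropped.
    """
    pairs = []
    for _, group in groupby(ordered, key=lambda r: r.system_id):
        items = list(group)
        for i in range(0, len(items) - 1, 2):
            pairs.append((items[i], items[i + 1]))
    return pairs
```

Each resulting pair would then be reduced to one intermediate record; that reduction step is not described in this section and is left out of the sketch.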

4.1.1 PSUMR Inputs. PSUMR requires one input file containing SCF records.
Two optional inputs are the intermediate record - old master, and a
parameter file.

4.1.1.1 SCF Records (File IN). Contains resource monitor collector (RMC)
SCF records.

4.1.1.2 Intermediate Record - Old Master (File OM). This optional input
will be copied to the output file before new records are added from the
current job's processing.

4.1.1.3 Parameter File (File PF). PSUMR will process the following
parameter language statements:

CONFIG

SUMMARY

SNUMB

The format and use of the parameter language is found in section 4.4.

4.1.2 PSUMR Output. Three outputs are available from PSUMR: (1) listing
of the parameter file if present; (2) intermediate record - new master file;
(3) intermediate record - current file.

4.1.2.1 Parameter File Listing (File RP). This is a listing of the
contents of file PF. Any of the statements processed which were invalid
will be flagged. Invalid contents will not halt the job execution, but may
affect how the data is summarized in the intermediate record.

4.1.2.2 Intermediate Record - New Master File (File NM). This file
contains the contents of the old master file, if present, plus the records
added from the current execution of the program.
IN - SCF accounting records
PF - optional parameter file
OM - old master intermediate record
NM - new master intermediate record
DR - optional intermediate output for DEMON
HF - daily monitor history file

Figure 4-1. RMDR Overview


4.1.2.3 Intermediate Record - Current File (File DR). This file contains
the intermediate records created during the current execution of the
program. This file is used to pass the intermediate records to the daily
monitor (DEMON), when present, in a subsequent activity.


4.1.3 PSUMR Deck Setup. The following control cards are required to
execute PSUMR:

$      IDENT    ACCOUNTING INFORMATION
$      USERID   USERID$PASSWORD/SCC
$      PROGRAM  PSUMR,DUMP
$      LIMITS   20,2CK,,5K
$      PRMFL    **,R,R,B29IDPX0/NEWRM0N/DDYN
$      PRMFL    H*,R,R,B29IDPX0/NEWRM0N/DDYN
$      DATA     PF            optional parameters
$      TAPE     OM            optional old master intermediate record
$      TAPE     NM            new master intermediate record
$      FILE     DR            current file output
$      SYSOUT   RP            parameter file listing
$      TAPE     IN            SCF input file
$      FILE     S1,S1R,50R    sort files
$      FILE     S2,S2R,50R
$      FILE     S3,S3R,50R
$      FILE     S4,S4R,50R
$      ENDJOB

The  sort  files'  size  should  be  increased  or  decreased  depending  upon  the 
input  volume.  Tape  files  may  be  replaced  by  permanent  or  temporary  disk 
files.  File  DR  may  be  null  if  the  daily  monitor  does  not  follow  in  a 
subsequent  activity. 

4.2  DEMON 


The daily monitor accepts the intermediate records created by PSUMR. A
history file will be initialized (if currently null) or updated from the data
in the intermediate record. A matrix structure is built to summarize and
store the data. The matrix contains 96 entries corresponding to the time
period to be summed together. The entry size is variable depending upon the
graphs to be produced. For DEMON, the default is all graphs. The history
file contains a copy of this matrix structure. For an update run, DEMON
copies the history file into core. A second matrix is built for the current
day's data. Both matrices are then updated from the input. The history
matrix is written to disk. Finally, the current input matrix is compared to
the history matrix. If 25% of all data items are not within 25% of the
history matrix value, the program generates a complete set of graphs to
indicate a significant change from the history file value.
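
The comparison rule in the last two sentences can be made concrete with a small sketch. It assumes that both matrices are flattened to equal-length lists of numbers, that "within 25%" is measured relative to the history value, that a zero history value counts as a deviation whenever the current value is nonzero, and that "25% of all data items" means at least 25%. The function name and parameters are hypothetical.

```python
from typing import Sequence


def significant_change(current: Sequence[float], history: Sequence[float],
                       tolerance: float = 0.25, fraction: float = 0.25) -> bool:
    """Return True when at least `fraction` of the items in `current`
    differ from the matching `history` item by more than `tolerance`
    (relative to the history value)."""
    if len(current) != len(history):
        raise ValueError("matrices must have the same number of data items")
    if not current:
        return False
    outside = 0
    for cur, hist in zip(current, history):
        if hist == 0:
            deviates = cur != 0                      # assumed handling of zero history
        else:
            deviates = abs(cur - hist) > tolerance * abs(hist)
        if deviates:
            outside += 1
    return outside >= fraction * len(current)
```

With 96 data items, for instance, 24 items outside the 25% band would trigger the full set of graphs under these assumptions, while 23 would not.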

4.2.1  DEMON  Inputs.  DEMON  has  one  required  input  and  two  optional  inputs. 


4.2.1.1 History File (File HF). A history file is required for DEMON
execution. The history file is a random file which contains an image of the
in-core matrix used for summarizing and comparing data.

4.2.1.2 Parameter File (File PF). The parameter file is optional.
However, if PSUMR was supplied with a SUMMARY statement, then DEMON should be
supplied the same SUMMARY statement for desired results. DEMON processes
only a SUMMARY statement. The format and use of the parameter language
statements is found in section 4.4.

4.2.1.3 Intermediate Records (File IN). This is an optional input. If
present, the intermediate record data will be summarized, as described
above, according to the parameter setting marked on the intermediate record.
The history file will be initialized or updated as appropriate. If
file IN is not present, the history file data will be used to generate a set
of graphs. This feature allows the user to get a current historical
"picture" at any time.

4.2.2 DEMON Outputs. DEMON outputs are the updated history file and a
listing file.

4.2.2.1 Updated History File (File HF). Refer to section 4.2.1.3.

4.2.2.2 Listing File (File RP). The output listing will contain a listing
of the parameter file, if any, and any generated graphs. The graphs
produced are listed and described in section 4.5.


4.2.3 DEMON Deck Setup. The following control cards are required to
execute DEMON:

$      IDENT    ACCOUNTING INFORMATION
$      USERID   USERID$PASSWORD
$      PROGRAM  DEMON,DUMP
$      LIMITS   10,24K,,10K
$      PRMFL    **,R,R,B29IDPX0/NEWRM0N/DDTN
$      DATA     PF            optional parameter file
$      TAPE     IN            optional intermediate file
$      PRMFL    HF,W,R,(history cat/file)
$      SYSOUT   RP
$      ENDJOB

File IN may be a permanent or temporary disk file. File HF is 24 random
LLINKS.


4.3 RPTSUM


PSUMR has the ability to maintain a master file of intermediate records.
RPTSUM will process this file to produce any number of reports. Each report

5.2.8 Idle Monitor. The Idle Monitor (IDLEM) is used to collect
data concerning CPU activity. This monitor can only be used in
conjunction with the MUM or the CM. It should not be activated if one
of the two aforementioned monitors is not active. If the Idle Monitor
is present on the R* file and active and if, in addition, neither the
MUM nor the CM is active, then the IDLEM will automatically be turned off.

The user should read subsections 5.2.1 and 5.2.5 for information
concerning the use of the IDLEM. A separate discussion of the format
of the collected data records is contained in subsection 5.4.9.

This monitor generates an excessive number of traces and significant
overhead. It should be activated only after the user ensures that the
reports produced because of its presence are an absolute necessity for
the evaluation. In most cases, this monitor should not be required.
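
The activation rule for the Idle Monitor reduces to a single condition, shown here as a minimal sketch. The function and parameter names are hypothetical; the rule itself is read from the paragraph above, interpreting "the MUM or CM is not active" as "neither is active".

```python
def idle_monitor_stays_on(idle_requested: bool, mum_active: bool,
                          cm_active: bool) -> bool:
    """The Idle Monitor remains active only if it was requested and at
    least one of MUM or CM is also active; otherwise it is turned off."""
    return idle_requested and (mum_active or cm_active)
```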

5.2.9 Transaction Processing System Monitor. The GMC Transaction
Processing System Monitor (TPSM) is used to collect data on the
performance of the Navy Transaction Processing Executive (NTPE)
System. A separate discussion of the format of the TPSM collected
data records is contained in subsection 5.4.10. The reports
containing data collected by TPSM are described in section 12.

When TPSM is active, the required traces must be enabled in the
computer system boot deck on the $ TRACE card (see table 5-1). This
is the only GMF monitor for which the minimum $TRACE card format
described in subsection 5.1 is not sufficient for proper execution. A
sample of the reports and run time procedures for the data reduction
program can be found in the Transaction Processing System
(section 12). The TPSM requires that at least the following segment
numbers from table 5-3 be used to generate the GMC R* file: 1, 9, 11,
19, 23, and 36. The complete process for generating an R* file is
described in subsection 5.6.

NOTE: The TPSM cannot be run concurrently with the TSSM and should
only be used on a WWMCCS W7.2 system. It should not be used on a
commercial release.

5.2.9.1 TPS Trace Collection. The TPSM is unlike most other GMC
monitors in that monitoring of the Transaction Processing System is
controlled via the operator console. Prior to collecting data, the
user must alter the NTPE (see subsection 5.2.9.2) and must also create
a suitable GMC R* file (see subsection 5.6). Once these actions are
performed and a GMC execution is started, the user must still perform
one additional action before data collection can begin. The TPSM is
turned on or off by the console operator via the TP MESS command. The
operator must request "TP MESS". When the console responds "TP
IV'TS?", the message "TRACE ON" is entered to start trace generation,
or "TRACE OFF" to suspend trace generation. This procedure can be
repeated as often as desired. The TPSM and the TSSM are the only GMC
monitors that can be turned on or off while the GMC is physically
executing.

5.2.9.2 Modifying the Transaction Processing System. To use the
TPSM, the user must alter the Navy Transaction Processing System.
NTPE is delivered as a set of alters to the commercial version of TPE
(these alters are in "*C" form for use by the SCED utility program).
Embedded in these original alters are the modifications to place the
GMC trace points in the executive modules (see figure 12-21). All
sections of code for the GMC support are under control of conditional
assembly parameters. To enable the GMC code, the changes shown in
figure 12-22 must be merged with the local file of changes developed
during the site customization of NTPE. The changes shown in this
figure have the effect of both correcting and enabling the GMF/TPSM
collector.

5.2.10 Timesharing Subsystem Monitor. The Timesharing Subsystem
Monitor (TSSM) is used to measure TSS performance. Section 15 details
those reports available from the data collected by this monitor. This
monitor should be used only on a WWMCCS W7.2 system. If it is to
be run on a commercial release, care should be exercised to ensure
that all alters are located correctly (see subsection 5.2.10.1).

The TSSM causes the trace to be taken from many points in TS1; the
collector builds its records which are then passed to the ER for
buffering. An example of the record format appears in subsection
5.4.11. TSSM requires that the following segment numbers from table 5-3 be
used to generate the GMC R* file: 1, 10a, 11, 19, 23 and 36a.
Subsection 5.6 shows how to generate an R* file.
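
A site that scripts its R* file generation might check the segment lists quoted in subsections 5.2.9 and 5.2.10 before building the file. The sketch below is illustrative only: the segment numbers are those given in the text for TPSM and TSSM, while the names `REQUIRED_SEGMENTS` and `missing_segments` and the representation of segment numbers as strings are assumptions.

```python
from typing import Dict, Iterable, List, Set

# Segment numbers from table 5-3 that the text lists as required per monitor.
REQUIRED_SEGMENTS: Dict[str, Set[str]] = {
    "TPSM": {"1", "9", "11", "19", "23", "36"},
    "TSSM": {"1", "10a", "11", "19", "23", "36a"},
}


def _segment_key(segment: str):
    """Order segments by their numeric part, then by any letter suffix."""
    digits = "".join(ch for ch in segment if ch.isdigit())
    return (int(digits), segment)


def missing_segments(monitor: str, included: Iterable[str]) -> List[str]:
    """Return the required segments for `monitor` that are not in `included`."""
    return sorted(REQUIRED_SEGMENTS[monitor] - set(included), key=_segment_key)
```

An empty result means the planned R* file contains every segment the text names for that monitor; it says nothing about segments needed by other monitors.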
9  TSSH  RSFRM+6  LDA -1,DU  Enter build mode

Data: UST address, program stack tally

Use: Cause entry to be made in stack of subsystem names; remove entry
from stack when trace 13 is received

10  TSSH  LODPEM+9  TNC SCERH6  EXEC primitive

Data: UST address, program stack tally, subsystem name (ASCII)

Use: Set flag to indicate actual code executed (as opposed to
subsystems CARD, FORT, MDQ, etc.)

11  TSSH  SYSFRM+4  LDA .LBU?,2  SYSTM primitive

Data: UST address

Use: Clear stack of subsystem names

12  TSSI  CU1CF+17  EAX1 .LCA1S,2  Log-off

Data: UST address

Use: Terminate processing of session, release UST address

13  TSSI  STARTP+3  LDA 0,3  Command received

Data: UST address, command text (ASCII)

Use: Record last user build mode command for session snapshot
reports; end build mode; clear subsystem name stack if command
is "NONE" and does not come from DRL CALLSS or DRL T.GOTO

14  TSSJ  LOGON+1  CANA =O7777,DL  Periodic check

Data: Line ID if new user waiting

Use: Produce report entry if time interval between trace 102 and this
one too great

15  TSSJ  LOCI 1+3  STQ BYTIME  GEWAKE - no users

Data: Length of GEWAKE (clock pulses)

Use: Force clearing of tables in data reduction program; opens
interval closed by trace 61

16  TSSJ  LN100+2  STA .LKST,2  Break or disconnect

Data: UST address, GEROUT status (binary)

Use: Set flag for reconnect if status 2 (disconnect); simulate input
complete if break and if I/O complete trace has been processed
(user initiates request for service with break as in DJST)

17  TSSJ  .SSDSP+7  TNC TYTSS  GEWAKE until subdispatch done

Data: Length of GEWAKE (clock pulses)

Use: Formatted dump or accounting for GEWAKEs; opens interval closed
by trace 61

18  TSSJ  TYTSS  STQ SLEEP  GEWAKE with subdispatch busy

Data: Length of GEWAKE (clock pulses)

Use: Formatted dump or accounting for GEWAKEs; opens interval closed
by trace 61

19  TSSJ  QSPEC  EAA SPEC  All Points Bulletin or remote I/O courtesy call

Data: UST address, flags (.LFLG2) for CRUN/DRUN, GEROUT status

Use: Completes terminal I/O started by trace 100

20  TSSJ  STGCC+5  CMPA 4,DL  Build mode input received

Data: UST address, flags (.LFLG2), GEROUT status

Use: Completes terminal I/O started by trace 100

21  TSSJ  DSTATM  LDA .LI0ST,2  SY** I/O complete

Data: UST address

Use: Completes disk I/O started by trace 24

22  TSSJ  REC0N2+1  CANA .FX19,DL  Place user in reconnect mode

Data: UST address, flag for data in transmission

Use: Positive assurance that session is being put in hold (not given
by trace 16)

59  TSSJ  PPTCC2+2  STZ .LCC,2  Disk I/O for tape mode complete

Data: UST address

Use: Completes disk I/O started by trace 24

104  TSSK  RETSBS+1  SZN .LSFLC  DRL processing complete

Data: UST address, A-register status if DRL T.SYOT or DRL TASK,
Q-register status if DRL SPAWN or DRL PASFLR, DRL number if
status reported

Use: End DRL processing state; report unusual status conditions


TSS passes flag to GMC to write initialisation records or GMC
writes initialisation records after it has recovered from a
lost data condition)

If  TRACE  OFF,  flag  («>) 

If  bad  console  input,  no  trace 
Use:  Initialise  tables  in  data  reduction  program 

91  TSSM  RETS3X+6  CMPX1 1,DU  Make subdispatch entry

Data: UST address

Use: Open interval closed by trace 88

92  TSSN  GUST1+2  CMPA .LTSRI,DL  Process log-on request

Data: Line ID (octal 2020 if deferred), reject flags (.TLFLG bit 32
or 35)

Use: Open interval closed by trace 1

93  TSSN  GUST21+15  TRA GUSTR-1  Reject user - bad line status

Data: Line ID

Use: Report along with log-on rejections (this is an exotic line
condition trace)

94  TSSN  GUST2A+4  CMPQ -030000,DU  Check for VIP as terminal type

Data: Terminal ID, type, number of VIPs allowed (.T760), number
logged on (.TL760)

Use: Explain trace 96 if it occurs

95  TSSN  GUST4+4  CMPQ GUSTT  Check UST wait time

Data: Line ID, flag if wait greater than 16 seconds

Use: If flag set, report as log-on reject

96  TSSN  GUSM+5  STA 2,3  Reject user

Data: Line ID

Use: Report trace if not already done

97  TSSN  MUST3+2  AkfDX ,2,0  UST compression

Data: Old, new UST addresses

Use: Update UST-oriented tables with new address

98  TSSN  ASGGC1+4  ASQ 1,0  UST area increase by 1K

Data: None

Use: Decrement amount of user memory available

99  TSSH  REUf 03+18  ASA 1,0  UST area decrease by 1K

Data: None

Use: Increment amount of user memory available

100  TSSH  RIO. 1 A* 2  LDA .LBUF,2  Terminal I/O request

Data: UST address, GEROUT opcode, CRUN flags (.LFLG2)

Use: Open interval closed by trace 19, 20, 84, 86, 87, or 103

101  TSSH  R.SFOK+2  LXL1 -1,6  Process command file t* function

Data: UST address, text (ASCII)

Use: Provide information to explain changes in .LFLG2 or to explain
trace 6

76  TSSH  TRMC7+7  EAQ 0,AU  Cancel CRUN mode

Data: UST address, flags (.LFLG2)

Use: Force clearing of subsystem name stack

102  TSSH  IRIHQ+4  LDH -0050000,DU  Issue remote inquiry GEROUT

Data: None

Use: Open interval closed by next trace; if long time, indicates
roadblock of TSS in .KROUT because a table is full

103  TSSO  760CC+3  CMPA 4,DL  VIP input complete

Data: UST address, flags (.LFLG2), GEROUT status

Use: Completes terminal I/O started by trace 100

5.2.10.2 Formats of TSS Traces. The collection routine executing
within TSS passes two-word traces to GMC through the dispatcher trace
mechanism. In all but four trace subtypes, GMC stores the A and Q
registers after a record control word and RSCR time to form a
four-word logical record. For subtype 24, GMC edits the Q register
before making a trace. For subtype 32, GMC additionally retrieves the
USERID and appends it to form a 6-word logical record. Subtype 105 is
generated internally by GMC when it senses a period of no TSS
activity. Logical records generated as a result of receipt of a
subtype 90 trace may have 3 different formats. The first format is
written whenever TSS passes a subtype 90 trace to GMC indicating that
the TSS traces have been turned off. GMC passes this trace to the
data tape in the standard format described above. The following 2
formats apply to a group of logical records written by GMC whenever it
recovers from a lost data condition or whenever TSS passes a subtype
90 trace indicating that TSS traces have been turned back on. In the
latter case, GMC formats a series of logical records; one 16-word
record is written for each user logged onto TSS to give such
attributes as UST address, line ID, USERID, and current program
stack. A four-word record which indicates the TSS swap area limits is
the last record written by GMC in the sequence of records. If no
users are logged on to TSS and if TSRI is not running, only the
four-word record is written. The Monitor source code gives the
formats of all TSS record subtypes.
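
The standard record layout described above can be illustrated with a
short Python sketch. It is not taken from the Monitor source: the
function name build_tss_record is hypothetical, the control word is
treated as an opaque value because its bit layout is not given in this
subsection, and the USERID is assumed to occupy two words, which is the
only size consistent with a 6-word record.

```python
MASK36 = (1 << 36) - 1  # GCOS words are 36 bits wide


def build_tss_record(subtype, control_word, rscr_time, a_reg, q_reg,
                     userid_words=()):
    """Return the logical record for one TSS trace as a list of words.

    Standard layout: control word, RSCR time, A register, Q register.
    Subtype 32 appends the USERID (assumed here to be two words).
    """
    record = [control_word, rscr_time, a_reg, q_reg]
    if subtype == 32:
        if len(userid_words) != 2:
            raise ValueError("subtype 32 needs two USERID words")
        record.extend(userid_words)
    for word in record:
        if not 0 <= word <= MASK36:
            raise ValueError(f"{word:o} does not fit in 36 bits")
    return record


standard = build_tss_record(9, 0o4, 0o123456, 0o777, 0o1)
with_userid = build_tss_record(32, 0o6, 0o123456, 0o777, 0o1,
                               userid_words=(0o1, 0o2))
print(len(standard), len(with_userid))
```

The subtype 90 and subtype 105 records are left out of the sketch
because their formats are defined only in the Monitor source code.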

5.2.10.3 Installation Procedures


5.2.10.3.1 Description of Monitor Software. The TSS Monitor program
element executes as a TSS master subsystem. When the master user logs
on and requests the program via "SYST GMF", it first performs the
address relocations needed to convert relative offsets into actual
slave addresses. Then it verifies that the instructions at the trace
points match those in the monitor coding (the originals have transfer
instructions patched over them when the monitor is executing). Any
mismatches found are reported on the master terminal, and verification
continues through the remaining trace points. If any mismatches were
found, the subsystem terminates without making any modifications to
TSS. If verification succeeds, the master subsystem copies part of
itself (about 1K of memory) into an area of module TSSO which was
reserved by means of patches on the TSS INIT file. The origin of this
area is the UST address origin which TSS would have used were the
patch not applied. Next, the master subsystem applies Execute Double
(XED) instructions to most trace points to save the return address and
indicator register. In a few rare cases, unconditional transfer
instructions are used. The master subsystem terminates at this point.
As TSS continues execution, traces will be formatted, but a switch
prevents GCOS traces from being written. If the console operator
enters "TS1 TRACE ON", GCOS traces will be written from TSS, and
additional overhead will be imposed on TSS. The console operator can
stop generation of TSS traces with "TS1 TRACE OFF". The traces can be
turned on and off multiple times.

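The verify-then-patch behavior of the master subsystem can be sketched
in Python as follows. This is an illustration only, not the Monitor's
coding: memory is modeled as a dictionary of address to instruction
word, and the names install_hooks, trace_points, and make_hook are
hypothetical.

```python
def install_hooks(memory, trace_points, make_hook):
    """Patch every trace point, but only if all of them verify first.

    memory       -- dict mapping address to the instruction word found there
    trace_points -- dict mapping address to the instruction word expected
    make_hook    -- callable returning the replacement word for an address
    Returns the list of mismatches; an empty list means hooks were applied.
    """
    mismatches = []
    for address, expected in sorted(trace_points.items()):
        found = memory[address]
        if found != expected:
            # Report the mismatch and keep checking the other points.
            mismatches.append((address, expected, found))
    if mismatches:
        return mismatches  # terminate without modifying anything
    for address in trace_points:
        memory[address] = make_hook(address)
    return []


memory = {0o100: 0o235007, 0o200: 0o710000}
points = {0o100: 0o235007, 0o200: 0o710000}
print(install_hooks(memory, points, lambda address: 0o717000 + address))
```

Checking every point before patching any of them is what allows the
subsystem to leave TSS untouched when a single alter is misplaced.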
5.2.10.3.2 Software Installation. When TSS is loading the monitor
subsystem, it must be able to access the program element using a MME
GECALL to a BCD name of .TSGMF. The program element may reside either
on a system file defined in the EDIT and FILES sections of the boot
deck or on a permanent file (B29IDPX0/GMFCOL/TSS/TSSMON) dynamically
accessed during TSS startup. Use of a permanent file requires
additional patches in the TSS INIT file. The permanent file is the
current method of implementation because it does not require changes
to the GCOS startup deck. The job stream used to create the program
element is located on file B29IDPX0/GMFCOL/TSS/TSGMF.

If the system file option is to be used, the $ PRMFL Q* card on file
TSGMF must be replaced with a $ TAPE Q*,X2D,,,,TSSMON card. The
startup file must be defined in the EDIT section of the boot deck as
follows:

$ FILDEF ST1.TSSMON,12/0,SYS,1T1

The tape drive name 1T1 must be appropriate to the hardware
configuration. If this startup file is appended onto an existing edit
tape, replace "1T1" with [illegible]. In the FILES section, insert the
following card in front of existing $ SYSTEM cards:

$ SYSTEM TSSMON

If a permanent file is used, no changes need to be made to the boot
deck, and no system interruption occurs during installation of the TSS
Monitor.

5.2.10.3.3 Software Activation. This section describes how TSS builds
tables so that the master user can find a subsystem named "GMF".

5.2.10.3.3.1 Overview of TSS INIT File Changes. The TSS INIT file is
a quick access permanent file named TS1, normally residing under USERID
OPNSUTIL, the default USERID for TSS, which is patchable as symbolic
location .TUSER. File TS1 is read as soon as TSS starts in order to
pass parameters for the current loading of TSS or to allow symbolic
specification of site option patches such as the maximum number of
concurrent users. The TS1 file has two sections: $INFO to
symbolically define site option parameters, and $PATCH to apply
patches to TSS beyond those already in the PATCH section of the boot
deck. Installation of the TSS monitor requires that patches be placed
at the end of the TSS INIT file. These patches are located on file
B29IDPX0/GMFCOL/TSS/TSS.PAT (updated for SRN7.2.3).

5.2.10.3.3.2 Definition of the Master Subsystem Name. The following
three patch cards, included in file TSS.PAT, overlay an unused program
descriptor in TSSA to define a subsystem named GMF with an edit name
of .TSGMF and attributes .BPRIV (can execute privileged DRLs, not
currently used), .BMAST (master subsystem), and .BMASX (permission to
alter TSS executive with a DRL instruction, not currently used):

6735  OCTAL  147155146040  GMF               .MTIMS
6736  OCTAL  336362274426  .TSGMF            .MTIMS
6737  OCTAL  40003         MASTER SUBSYSTEM  .MTIMS
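
The octal values on the first two patch cards can be checked by
decoding them. The Python sketch below treats the first word as four
9-bit ASCII characters and the second as six 6-bit BCD characters. The
function names are hypothetical, and the partial BCD table (digits,
letters, blank, and period only) is an assumption based on the usual
GE-600-series BCD code, not something stated in this document.

```python
def split_fields(word, width):
    """Split a 36-bit word into equal fields, leftmost field first."""
    count = 36 // width
    mask = (1 << width) - 1
    return [(word >> (36 - width * (i + 1))) & mask for i in range(count)]


# Partial BCD table (assumed): digits, blank, period, and letters.
BCD = {code: str(code) for code in range(10)}
BCD[0o20] = " "
BCD[0o33] = "."
BCD.update({0o21 + i: c for i, c in enumerate("ABCDEFGHI")})
BCD.update({0o41 + i: c for i, c in enumerate("JKLMNOPQR")})
BCD.update({0o62 + i: c for i, c in enumerate("STUVWXYZ")})


def decode_ascii(word):
    """Decode four 9-bit ASCII characters."""
    return "".join(chr(code) for code in split_fields(word, 9))


def decode_bcd(word):
    """Decode six 6-bit BCD characters; unknown codes become '?'."""
    return "".join(BCD.get(code, "?") for code in split_fields(word, 6))


print(repr(decode_ascii(0o147155146040)))  # subsystem name
print(repr(decode_bcd(0o336362274426)))    # edit name
```

Under these assumptions the first word decodes to "gmf " in lowercase
ASCII and the second to ".TSGMF", which agrees with the subsystem and
edit names given in the text.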

5.2.10.3.3.3 Definition of the UST Origin. The following patch,
included in file TSS.PAT, disables the check for overlaying VIP code in
TSSO when no VIPs are configured. The patch NOPs out a conditional
transfer instruction around an instruction which loads a UST origin
corresponding to the case when VIPs are configured. The letter "O" in
these patches means that the patch locations are with respect to the
origin of TSSO, not to slave address 0 of TSS as in the subsystem
attribute patches. The letter "R" before patch content indicates that
the address field of the patch must be relocated to the beginning of
the module (TSSO) identified in column 7 of the patch. The symbolic
address for this patch is IN7630+2. The following patch completes
the TSS INIT file when the TSS Monitor is loaded on a startup file:

5275  OOCTAL  11007  DON'T TRANSFER IF NO VIPS  .MTIMS

5.2.10.3.3.4  Installation  from  a  Permanent  File.  The  principle  of 
this  method  is  that  if  a  job  has  a  file  code  **  active,  any  MME  GECALL 
will  cause  that  file  to  be  searched  before  the  system  files  are 
searched.  Patches  on  the  TSS  INIT  file  must  access  a  permanent  file 
before  subsystem  loading  begins  and  release  it  after  subsystem  loading 
finishes.  The  TSS  User  Derail  Loader  (TUDL,  SDN  K79005)  provides  an 
example  of  how  to  access  and  release  a  system  loadable  file  from 
within  the  TSS  executive.  The  patches  on  the  TSS  INIT  file  for  the 
TSS  Monitor  are  compatible  with  TUDL  because  TUDL  starts  its  work 
after  the  TSS  Monitor  has  finished  its  work.  TUDL  may  further 
relocate  the  UST  origin  upward,  but  TUDL  uses  the  address  stored  by 
the  TSS  Monitor  patches.  A  list  of  an  entire  $  PATCH  section  is  given 
in  the  TSSM  source  code.  The  first  four  patches  match  ones  described 
in  earlier  sections. 

The  bulk  of  the  patches  used  to  access  a  permanent  file  are  placed  in 
an  area  of  TSSO  which  is  reserved  for  UST  space  when  the  UST  origin  is 
not  adjusted  upward  (actually,  this  space  was  needed  in  releases  prior 
to  W7.2.0  to  prevent  the  generation  of  a  UST  for  TSR1  from  destroying 
the  code  at  the  end  of  TSS  startup;  with  the  TSS  INIT  file  feature, 
enough  code  intervenes  so  that  this  buffer  is  not  necessary  and  so 
that the UST origin can be moved up and not destroy the code). The
first transfer into these patches is at offset 4677 in TSSO (symbolic
offset DMYERR+2, instruction EAX5 1). Here, the USERID in the file
structure defined in the patches is stored into the SSA of TSS at
location .SUID. Then a MME GEMORE accesses the file. Use of .SUID
makes FMS think that the file is being accessed by its owner, and thus
no special permissions are needed. If the MME GEMORE is denied, a
flag is set to 0 before control returns to the original TSSO coding
from offset 2013 in the patches.

The second transfer, at octal offset 5272 (symbolic location IN7630+3),
replaces the instruction defining the UST origin. The patch at octal
offset 5271 remains intact. In the second group of patches, the
permanent file must be released, the number of PATs decremented by
one, and the USERID in .SUID set to zero. If the GEMORE in the first
set of patches is denied, then the number of PATs is not decremented.
This alteration of word .SRPAT in the SSA is necessary because
releasing the file does not remove the file code from the SSA. The
patch for defining the new UST origin is moved to octal offset 2031 in
the patches, just before a transfer back to TSSO.

5.2.10.4 Production Use of the Monitor. The following steps must be
taken to enable data capture:

1. Append patches on file B29IDPX0/GMFCOL/TSS/TSS.PAT to the
Timesharing INIT File (normally OPNSUTIL/TS1).

2. Run the file B29IDPX0/GMFCOL/TSS/TSGMF to create the new TSS
subsystem file.

3. Start TS1.

4. Start a copy of GMC with the TSS Monitor active.

5. Log on to a master USERID and enter "SYST GMF" at the
prompt following log-on to install the hooks into TSS code.
Traces will not start at this time. The master USERIDs
assembled in TSS are MASA and MASB. Unless these are patched
or redefined in the TSS INIT file, only a terminal ID
designated as master in .KSECH may be used for this step.

6. Enter "TS1 TRACE ON" from the system console to start
generation of traces from TSS.

7. Enter "TS1 TRACE OFF" from the system console to suspend
generation of traces.

Steps (5), (6), and (7) execute independently of (4); however, use of
step (6) without use of step (4) will cause unnecessary TSS overhead,
because traces are generated and then lost when GMC is not in
execution. Steps (6) and (7) may be repeated multiple times if traces
are to be captured for specific periods of the day. (The companion
data reduction program, described in section 15, cannot process GMC
sessions longer than 9 hours.)

5.2.10.5 Monitor Limitations. To obviate lockup fault, initial trace
generation (at "TS1 TRACE ON", and following lost data) is inhibited
until the number of TSS users falls below 50. Further, all trace
generation is inhibited until the number of users falls below 100.

(The companion data reduction program is limited, by parameter, to 50
active USTs.)
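
The two thresholds can be expressed as a small Python sketch. The
function and constant names are hypothetical, and the sketch assumes
that "falls below" means a strict comparison against the current
number of TSS users.

```python
INITIAL_LIMIT = 50    # applies at "TS1 TRACE ON" and after lost data
ABSOLUTE_LIMIT = 100  # applies to all trace generation


def trace_generation_allowed(user_count, initial):
    """Return True if traces may be generated for this user count.

    initial -- True when trace generation is being started (or restarted
               after lost data), False once it is already under way.
    """
    limit = INITIAL_LIMIT if initial else ABSOLUTE_LIMIT
    return user_count < limit


for users in (49, 50, 99, 100):
    print(users,
          trace_generation_allowed(users, initial=True),
          trace_generation_allowed(users, initial=False))
```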
Word    Bits    Information

4       0-29    CP time usage
        30-35   IOM command

5       0-5     Not used
        6-35    SNUMB

6       0-5     Device command

                Special note:
                A command of octal 72 to a permanent disk
                pack indicates that a pack exchange is in
                progress. The .MGP66 module issues
                another standby command to the device to
                which the permanent pack is to be moved.
                A special device name record should appear
                either in the current block on tape or at
                the beginning of the next block to confirm
                the pack exchange.

        6-17    DCW length
        16-35   File origin block number

7       0-13    File size
        14      Sysout flag
        15      Seek flag
        16-35   Seek address

8       0-5     Device command
        6-11    Device number
        12-17   IOM PUB number
        18-23   I/O command
        24-26   Not used
        27-29   IOM number
        30-35   Record count

9       0-17    MBA of job issuing connect or zero if
                nonextended memory
        18-35   I/O queue address in SSA (absolute
                address if nonextended memory or if
                value less than 64K; relative address
                (to MBA) if extended memory and if value
                greater than 64K)

10      0-8     Not used
        9-17    Activity number
        18      Flag (=1 - I/O status is stopped)
        19-28   Not used
        29-34   I/O status (I/O queue word 0)
        35      Flag (=1 - system job)

11      0-35    .CRCOM (use only bits 18-29)

12      0-35    .CRCOM+2 (use only bits 18-29)

13      0-35    .SGCPT (use only bits 18-35)

14      0-35    .SRQCT (use only bits 0-17)

15      0-35    .SNIO (use only bits 18-35)

(words 11-15 are not generated if the IDENT for a job is reported)

11-20           IDENT (words 11-14 described above are not
                collected)

21-22           USERID
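
Fields of this kind can be extracted with a small helper, shown here
as a Python sketch. It assumes the usual GCOS convention that bit 0 is
the most significant bit of the 36-bit word; the names field and
decode_word_8 are hypothetical, and the layout used is that of word 8
in the table above.

```python
def field(word, first, last):
    """Extract bits first..last of a 36-bit word (bit 0 = leftmost)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def decode_word_8(word):
    """Break word 8 of the record into its named fields."""
    return {
        "device_command": field(word, 0, 5),
        "device_number": field(word, 6, 11),
        "iom_pub_number": field(word, 12, 17),
        "io_command": field(word, 18, 23),
        "iom_number": field(word, 27, 29),
        "record_count": field(word, 30, 35),
    }


# Build a sample word from known field values, then decode it again.
sample = ((0o25 << 30) | (3 << 24) | (0o10 << 18)
          | (0o31 << 12) | (1 << 6) | 5)
print(decode_word_8(sample))
```

The same helper applies to the other words of the record by changing
the bit ranges.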
5.4.3.2 MSM Special Record. During the execution of MSM, a special
record is written at preselected times during the monitoring session.
These records are used to analyze SSA cache core, when configured.
The format of this record is shown below. This record is based on
data collected during the processing of GMC trace events 73 or 76.


Word    Bits    Information

1       0-17    Size of record (=517)
        18-26   Not used
        27-35   Octal 7 (trace number)

2-516   0-35    a. Number of times each GCOS module 1-515
                   was in the SSA cache buffer
                b. Number of times each GCOS module 1-515
                   was loaded by an I/O because it was not
                   in the SSA cache buffer

517     0-35    Not used

518     0-35    Flag (=2 - case a. above)
                     (=3 - case b. above)


5.4.3.3 Device Name Record. If either MSM or CM is active, GMC
writes a record which correlates device names to device addresses.
The System Configuration Name table is processed sequentially as this
record is formatted. Names for all disk devices are reported. In
order to detect exchanges of mass storage devices, GMC periodically
examines the device name table. If any changes have occurred, then
another device name record is written. This record is variable in
size, and is recognized by the special format of the second word.


Word    Bits    Information

1       0-17    Size of record
        18-26   Not used
        27-35   Octal 7 (trace number)

2       0-35    Octal 535353535353

3-n     0       Flag (=1 - fixed device if mass storage)
                (set if bits 13 and 14 of the second word
                of the SCT entry for any mass storage
                device are both zero; in Shared Mass
                Storage environment, shared devices must
                be fixed)
        1-5     IOM number
        6-11    Channel number
        12-17   Device number
        18-35   Device name in BCD
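
A reader for this record might look like the Python sketch below. It
is an illustration under stated assumptions: bit 0 is taken to be the
leftmost bit of a 36-bit word, the record is supplied as a list of
integers, the function names are hypothetical, and the partial BCD
table (digits, letters, blank, and period) follows the usual
GE-600-series code rather than anything given in this document.

```python
MARKER = 0o535353535353  # second word identifies a device name record

# Partial BCD table (assumed): digits, blank, period, and letters.
BCD = {code: str(code) for code in range(10)}
BCD[0o20] = " "
BCD[0o33] = "."
BCD.update({0o21 + i: c for i, c in enumerate("ABCDEFGHI")})
BCD.update({0o41 + i: c for i, c in enumerate("JKLMNOPQR")})
BCD.update({0o62 + i: c for i, c in enumerate("STUVWXYZ")})


def field(word, first, last):
    """Extract bits first..last of a 36-bit word (bit 0 = leftmost)."""
    return (word >> (35 - last)) & ((1 << (last - first + 1)) - 1)


def parse_device_name_record(words):
    """Return one dict per device entry (words 3-n of the record)."""
    if len(words) < 2 or words[1] != MARKER:
        raise ValueError("not a device name record")
    entries = []
    for word in words[2:]:
        name = "".join(BCD.get(field(word, bit, bit + 5), "?")
                       for bit in (18, 24, 30))
        entries.append({
            "fixed": field(word, 0, 0) == 1,
            "iom": field(word, 1, 5),
            "channel": field(word, 6, 11),
            "device": field(word, 12, 17),
            "name": name,
        })
    return entries


entry = ((1 << 35) | (0o10 << 24) | (1 << 18)
         | (0o24 << 12) | (0o62 << 6) | 0o01)
print(parse_device_name_record([(3 << 18) | 7, MARKER, entry]))
```

The sketch ignores the size field in word 1 and simply reads every
word after the marker, since the text does not say whether the size
counts the first word.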

5.4.3.4 FILSYS Catalog Structure Record. During the execution of
the Mass Store Monitor, certain MME GEFSYE (GCOS Trace 15) data are
collected concerning the catalog file string that is being
referenced. The purpose of this data is to determine how many
connects are being made because of the particular structure of a given
catalog or file. This data is also used to provide the catalog file
string name associated with the various user file codes that are
reported by the Mass Store Monitor. MME GEFSYE traces are only
processed if generated by a system program (program number less than
15 or FSTSLV). In addition, only the following GEFSYE's will be
Table 6-1. (Part 4 of 4)

Other Reports

37/PA1C    Peripheral Allocator Report
38/ACTIVE  Activity Report/Excessive Resource Report/Abort
           Report/IDENT Report
39/MAP     Memory Map
47/OUT     Out of Core Report
           Special Job Memory Reports
           System Program Usage Report
           Memory Statistics Report
           Distribution of Urgency Over Time Report
           Zero Urgency Job Report
cover a larger range of values. This change could be made via data cards
and would not increase the size of the program.

The second method would involve increasing the size of the histogram by
altering the value of TABSIZ. As long as the size requested does not
exceed 50, this change can also be done via a data card. However, if an
individual histogram needs to be larger than 50 buckets, the user will
need to change the value of MXTBSZ. This change will require a change to
source code, a recompile, and probably an increase in program size. All
references to MXTBSZ must be altered. This would need to be done in the
EDIT subsystem of Timesharing.

The  remaining  items  that  can  be  modified  are  the  title  and  the  vertical 
axis  headers.  The  method  for  altering  the  histogram  parameters  is 
detailed  in  subsection  6.1.6.  Table  6-2  shows  the  default  values  for  all 
histograms. 

6.1.4  Plot  Options.  There  are  three  characteristics  directly  available 
to  the  user  for  each  individual  plot  axis  used. 

The  first  characteristic,  MAXNUM,  is  the  maximum  number  of  entries  to  be 
plotted  on  each  vertical  plot  axis. 

The  second  characteristic,  YMAX,  defines  the  upper  limit  of  the  horizontal 
display  axis. 

The  third  characteristic,  YMIN,  defines  the  lower  limit  of  the  horizontal 
display  axis.  The  method  for  altering  these  values  is  explained  in 
subsection  6.1.7.  Table  6-3  shows  the  default  values  for  all  plots. 

6.1.5  Default  Option  Alteration.  The  general  format  for  an  option 
request  is  as  follows:  the  first  card  contains  an  action  code  describing 
the  action  to  be  taken.  Subsequent  cards  modify  report  parameters  for 
some of the action codes. All input cards are free format; the only
requirement is that at least one blank space separate multiple input
parameters. The very last input card must have the word "END" typed on
it.  This  card  must  be  present  whether  or  not  any  other  input  options  are 
selected.  Available  actions  with  their  (default)  implications  are  shown 
in  table  6-4.  There  is  no  order  required  for  the  options.  In  reading  the 
following  sections  it  should  be  remembered  that  the  first  card  for  any 
input  option  must  be  the  action  code  specification  with  no  other  data 
present  on  the  card. 
Table 6-2. Default Values for Histograms (Part 1 of 2)

ID #    Low Value    Interval Size    Number of Buckets

1       4            4                50
2       0            50               50
3       0            250              50
4       0            1                50
5       4            4                50
6       0            1                50
7       0            5                50
8       0            200              50
9       0            200              50
10      .95          .1               50
11      4            4                50
12      0            10               50
13      0            1                50
14      0            1                50
15      0            1                50
16      0.0          5.0              50
17      0            10               50
18      200 *        2 *              50
19      0            20               50
20      0            1                50
21      0            1                50
22      0            8                50
23      0            25               50
24      0            25               50
25      50           2                50
29      0            1                50
30      0            1                50
31      0.0          5.0              50
32      0.0          5.0              50
33      0.0          5.0              50
34      0.0          5.0              50
35      5            5                50
36      5            5                50
40      0            1                50
41      0            1                50
42      0            10               50
43      0            1                50
44      0            1                50
45      0            10               50
46      0            1                50
48      0.0          5.0              50
49      0.0          5.0              50
50      0.0          5.0              50
Table 6-2. Default Values for Histograms (Part 2 of 2)

ID #    Low Value    Interval Size    Number of Buckets

51      0.0          5.0              50
52      0.0          5.0              50
53      0            1                50
54      0            250              50
55      0.0          5.0              50
56      0.0          5.0              50
57      0.0          5.0              50
58      0.0          5.0              50

* The Low Value and Interval Size parameters for histogram ID #18 are
determined by the program in a dynamic manner.

LOW VALUE = MEMORY CONFIGURED ON SYSTEM -
            PRECODED LOW VALUE (200)

INTERVAL SIZE = ([MEMORY CONFIGURED ON SYSTEM] *
                [PRECODED INTERVAL SIZE (2)] -
                [CALCULATED LOW VALUE]) / NUMBER OF BUCKETS
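
The dynamic calculation for histogram ID #18 can be worked through in
a short Python sketch. It reads the footnote as LOW VALUE = memory
minus the precoded low value, and INTERVAL SIZE = (memory times the
precoded interval size, minus the calculated low value) divided by the
number of buckets; that reading of the garbled operators is an
assumption, as are the function name and the sample memory figure.

```python
def histogram_18_parameters(memory_configured, precoded_low=200,
                            precoded_interval=2, buckets=50):
    """Return (low value, interval size) for histogram ID #18.

    memory_configured is in whatever unit the program uses for memory;
    the document does not state the unit.
    """
    low_value = memory_configured - precoded_low
    upper_limit = memory_configured * precoded_interval
    interval_size = (upper_limit - low_value) / buckets
    return low_value, interval_size


low, interval = histogram_18_parameters(300)
print(low, interval)
```

With a configured memory of 300 the low value is 100, the upper limit
is 600, and the 50 buckets are each 10 wide. The upper limit used here
matches the one given for plot ID #59 in table 6-3.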
Table 6-3. Default Values for Plots

ID #    Max Size of Plot    Lower Plot Limit    Upper Plot Limit

26      Unlimited           0.                  456.
27      Unlimited           0.                  456.
28      Unlimited           0.                  114.
59      Unlimited           0.                  *

* The Upper Plot Limit for Plot ID #59 is determined dynamically by the
program.

Limit = Memory configured on System * Histogram Interval Size for
        Histogram ID #18 that is originally coded into the program
        (set to a value of 2)


Table 6-4. Available Report Actions and Their (Default) Values
(Part 1 of 2)


HISTG - Modify a histogram (see table 6-2)

PLOT - Modify a plot (see table 6-3)

ON - Turn a specific report on - (all reports on except Memory Map and
Out of Core Report)

OFF - Turn a specific report off - (all reports on except Memory Map and
Out of Core Report)

TIME - Set a timespan(s) for reporting - (total time reported)

ALLOFF - Turn all reports off except those specified - (all reports on
except Memory Map and Out of Core Report)

ALLON - Turn all reports on except those specified - (all reports on
except Memory Map and Out of Core Report)

ERROR - Do not stop on an option request error - (stop on an input error)

DEBUG - Program debug requested - (no debug)

ALLOC - Stop program after a specified number of memory allocations have
been requested - (entire tape processed)

NREC - Stop program after a specified number of tape records have been
processed - (entire tape processed)

NOUSER - Do not print USERID on any report - (USERID printed on certain
reports)

IDLE - Turn off all Idle Monitor reports - (all IDLE reports on)

WASTED, CORE, IO, CPU, RATIO, URG - Changes parameters used in the
Excessive Resource Usage Report - (20K, 50K, 30 MIN, 30 MIN, 5, 40)

ABORT - SNUMBs not to report in the ABORT Report - (all SNUMBs that
abort are reported)

PLTINT - Change interval at which plots are printed - (10 MIN)

FSTSLV - Change the lowest allowable user program number - (14 decimal)
must be expressed as four character fields with no intervening blanks.
Time is based on a 24-hour clock. If a user wants to request the time
4:07, he must input 0407. All times must include four characters.


If  a  start  time,  but  no  stop  time,  is  desired,  no  characters  should  be 
entered  after  the  minutes  of  the  start  time.  If  a  stop  time  is  requested, 
there  must  be  a  start  time  corresponding  to  it.  If  the  user  wants  to 
start  at  the  beginning  of  data  collection  and  stop  at  some  specified  time, 
but  is  not  sure  of  the  start  time,  a  start  time  of  0001  should  be  used. 
Figure  6-5  shows  the  format  for  this  option. 
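
The four-character time rule can be illustrated with a short Python
sketch. The function names are hypothetical; the sketch only formats
and checks a time field and does not build the full TIME card sequence.

```python
def card_time(hour, minute):
    """Format a 24-hour time as the four-character field used on cards."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("time is outside the 24-hour clock")
    return f"{hour:02d}{minute:02d}"


def is_valid_card_time(text):
    """Check that text is exactly four digits naming a valid time."""
    if len(text) != 4 or not text.isdigit():
        return False
    return int(text[:2]) <= 23 and int(text[2:]) <= 59


print(card_time(4, 7))
print(is_valid_card_time("407"), is_valid_card_time("0407"))
```

A start time of 0001, as recommended above when the true start time is
not known, passes the same check.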

6.1.11  Turn  All  Reports  Off  Except  Those  Specified  (Action  Code  ALLOFF). 
All  reports  except  those  explicitly  identified  here  are  to  be  turned  off. 
The  inputs  consist  of 


A B C . . . Y (max of 25)

where A through Y are the report ID numbers (table 6-1) to be turned on.

The format is shown in figure 6-6. This option will control the printing
of all reports, including histograms, if they contain a specific ID number.

6.1.12 Turn All Reports On Except Those Specified (Action Code ALLON).
All reports except those explicitly identified here are to be turned on.
The input consists of


A B C . . . Y (max of 25)

where A through Y are the report ID numbers (table 6-1) to be turned off.
The format is the same as for action code ALLOFF (see figure 6-6). This
option will control the printing of all reports, including histograms, if
they contain a specific ID number.

6.1.13 Continue Data Reduction After an Input Option Error (Action Code
ERROR). This code allows data reduction to continue when an error has
been detected and reported in an input option request. By default, data
reduction aborts and reports the error. Only the Action Code card
is required.

6.1.14  Debug  For  a  Given  Program  Number  (Action  Code  DEBUG).  This  is  a 
debug  option  which  supplies  large  amounts  of  output  for  a  given  program 
number.  It  should  be  used  only  in  cases  of  data  reduction  problems.  Card 
1  contains  the  word  DEBUG  and  card  2  contains  a  program  number.  A  program 
number  of  -150  will  provide  detailed  debug  on  system  scheduler  activities. 

6.1.15 Stop After a Specified Number of Tape Records Processed (Action
Code NREC). This option is useful when a tape problem occurs and the
entire tape cannot be processed. When this occurs, the program will
usually  abort  with  an  I/O  error  and  some  reports  might  be  lost.  If  a  tape 
error  does  occur  during  data  reduction,  the  operator  should  type  a  "U"  in 
where

A - The word TIME

N - Report NAME to be time spanned (table 6-1)

M - Number of different times appearing on Card 3. N and M must be
separated by at least one blank.

B,C,D,E - Start and stop times used to define the time spans.
Times must be separated by one or more blanks.


For each report that is to be timespanned, the entire sequence of three
cards must be repeated.


Figure 6-5. TIME Action Code Format


Card 1 - A

     2 - N

     3 - B C D E ...


where

A - The word ALLOFF or ALLON

N - The number of report IDs appearing on card 3; cannot exceed 23

B,C,D,E - ID numbers of those reports not to be turned OFF/ON. All numbers
must be separated by at least one blank.


Figure 6-6. ALLOFF/ALLON Action Code Format
In order to stop data reduction processing prior to the tape error. The
first card contains the word NREC and the second card contains the number
of tape records to be processed.

6.1.16 Suppress USERID (Action Code NOUSER). This action code is used to
suppress the printing of USERIDs on those reports where the USERID
normally appears. Only the Action Card is required.

6.1.17 Turn Idle Reports Off (Action Code IDLE). This option will turn
off histograms dealing with idle CPU information (i.e., report IDs 16, 19,
20, 21, 22, 31-34, 43-54, 55-58). The user should realize that these
reports are useful in determining the I/O boundness of the system.
However, on most systems, the idle trace is 70 percent of the entire tape,
so that, by turning off this processing, processing time can be reduced by
over 50 percent. Only the action card is required for this option.

6.1.18 Change Excessive Resource Limits Used In Excessive Resource Report
(Action Code WASTED, PORE, IO, CPU, RATIO and URG). This report lists all
jobs which are above a preset threshold for any of the following resources:

Wasted  Memory 
Excessive  Memory 
Excessive  CPU  time 
Excessive  I/O  time 
Excessive  Ratio 
Excessive  Urgency 

These  limits  are  currently  set  to  the  values  specified  in  table  6-4  and 
may  be  changed  by  using  this  option.  The  format  for  this  option  consists 
of  Card  1  specifying  the  action  code  and  Card  2  specifying  the  new 
threshold  limit.  This  report  is  explained  later  in  this  chapter. 

6.1.19 Eliminate SNUMBs From Abort Report (Action Code ABORT). This
report lists all activities that fail to go to EOJ (i.e., abort). The
details of the report are given in section 6.3 of this chapter. At times,
jobs are designed in such a way that they can be terminated only via a MME
GEBORT or operator command. While these jobs do not go to EOJ, they have
processed correctly and have not resulted in wasted computer resources.

This option allows the user to request that these jobs not be included in
the Abort Report. The first card contains the action code ABORT. The
second card contains the number of jobs that will be deleted from the
Abort Report. This number may not exceed 10. If more than 10 jobs are
listed, only the first 10 will be deleted. The third card contains the
SNUMB of each job to be deleted. Each SNUMB must be followed by at least
one blank column.
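As a rough illustration of the three-card ABORT sequence, the Python sketch below builds the card images from a list of SNUMBs. The function name `abort_cards` and the sample SNUMBs are hypothetical, and exact card column positions are not given in the text, so the layout here is an assumption.

```python
MAX_ABORT_SNUMBS = 10  # limit stated in the text


def abort_cards(snumbs):
    """Return the three card images for the ABORT action code."""
    # Only the first 10 SNUMBs are honored, so drop any extras up front.
    kept = list(snumbs)[:MAX_ABORT_SNUMBS]
    # Each SNUMB is followed by at least one blank, including the last one.
    card3 = "".join(s + " " for s in kept)
    return ["ABORT", str(len(kept)), card3]


for card in abort_cards(["J001A", "J002B", "J003C"]):
    print(repr(card))
```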

6.1.20 Change the Plot Interval (Action Code PLTINT). Currently, all
plots are output at 10-minute intervals. The plot interval controls
the output of all plots; i.e., one plot cannot have a different time
interval than another plot. The first card of this option contains the
action code PLTINT. The second card contains the new plot interval
in minutes.
6.2 Processing


6.2.1 General. The reports of the MUDRP are intended to aid in the
following: 

o  System  sizing  -  both  memory  sizing  and  processor  utilization. 

o  Job  flow  analysis  -  determining  if  and  where  a  bottleneck  exists 
and  the  user  memory  loading  and  the  daily  load  distributions. 

o  System  perturbation  measures  -  allows  the  user  to  evaluate  how  a 
new  procedure  or  new  load  may  alter  the  utilization  of  the  system 
as well as determine the total utilization/capacity of the system.

o  Large  user  jobs  -  aid  in  identifying  specific  jobs  which  are 
misusing  or  "hogging"  system  resources. 

Figure  6-7  illustrates  how  the  monitor  will  pinpoint  these  various  areas. 
For  example,  if  the  monitor  indicates  a  large  percentage  of  processor 
idleness  with  high  memory  demand  and  low  memory  availability,  a 
dispatching  or  I/O  bottleneck  would  be  indicated.  This  would  be  caused  by 
the  I/O  not  completing  its  services  in  a  sufficiently  timely  manner  to 
allow  full  use  of  the  processors.  If  processor  use  was  very  high  and 
memory  demand  and  availability  were  high,  a  memory  allocation  bottleneck 
or  an  overloaded  processor  would  be  indicated. 

6.2.2 JCL. Figure 6-8 presents the JCL needed to run a total MUM
reduction. The following points describe key features of the required JCL:

o 74K required for memory

o 15K sysout requirement would vary depending on amount of data
collected. This figure would be significantly higher if the
Memory Map or Out of Core Report were produced.

o The DATA I* card is used to indicate the presence of data cards.
All data cards must immediately follow this card. At least one
data card must be present. That card will contain the word END
and is used to signify the end of input data. The END card must
be present even if no other data cards are desired. The data
cards shown in the example are those recommended for most
analyses.

o An additional 8K will be required to load the MUM reduction
program, but this 8K will be released immediately upon loading.

o A JCL file is already established for the user under the file
B29IDPX0/JCL/MUM.
$      IDENT     18202 51/30/3044, C702
$      SELECT    B29IDPX0/OBJECT/MUM
$      TAPE      01,AID,,18897
$      LIMITS    999,74K,-4K,15K
$      DATA      I*
(Following is a list of recommended data cards)
ALLON
10
5 7 12 13 14 15 17 29 30 39 46 47
SPECL
3
TS1 FTS TINT
END


Figure 6-8. JCL to Run MUDRP
Table 6-5 shows all the MUDRP file codes and their corresponding reports.

6.3 Outputs

In  this  section,  a  simple  explanation  of  how  each  report  was  derived  from 
the  data  is  given.  Subsection  6.1  discussed  how  the  ranges  and  other 
options  of  each  report  may  be  modified  to  fit  an  individual  installation. 
While  this  section  will  provide  some  insight  as  to  how  an  analyst  should 
proceed  to  review  all  the  reports  produced  by  the  MUDRP,  section  14 
provides  a  step-by-step  approach  as  to  how  a  memory  analysis  might  be 
conducted. 

Immediately  prior  to  the  output  of  the  histograms,  the  user  will  find  a 
printout  containing  processing  information.  Included  in  this  information 
is  the  following: 

o Printout of all input options selected by user

o  Indication  of  multireel  tapes  that  are  being  requested  and  have 
been  mounted 

o  Indication  of  the  monitors  that  were  active  during  data  collection 

o  Error  messages  -  all  error  messages  are  either  self-explanatory 
or  else  followed  by  the  words  "For  Information  Only."  The  latter 
messages  are  used  by  CCTC  for  future  enhancements  and  as  such  can 
be  ignored  by  the  user. 

o If the time frame option was used, an indication of when the
various time frames were reached.

6.3.1  MUM  Title  Page.  The  Memory  Utilization  Monitor  (MUM)  title  page 
contains a summary of the system's configuration and activity over the
measurement  period  (see  figure  6-9).  It  displays  the  time  the  monitor  was 
initiated  and  terminated,  as  well  as  identifying  the  system  which  was 
monitored  and  the  tape  number(s)  containing  the  data.  The  configuration 
information  is  augmented  by  the  amount  of  memory  dedicated  to  the 
operating  system  itself,  including  that  used  by  the  memory  allocation 
program. These figures will give the user a good idea of how much hard
core space remains which could be used for SSA module hard core loading.

If SSA cache is also configured, the amount of memory being used for this
feature is also listed. The version number should be 09-83 CHG-7.

Immediately  following  is  a  summary  of  the  work  processed  over  the 
measurement  period.  The  first  set  of  lines  provides  information 
concerning  the  overhead  generated  by  the  actual  data  collection.  The 
monitor name is given, its CPU time in seconds, and its overhead as a
function  of  total  processor  power.  The  GMF  executive  overhead  is 
separated  from  the  actual  monitors  and  is  listed  as  "EXEC".  The  monitor 
"NAME"  is  an  area  of  code  within  the  Mass  Store  Monitor  and  even  though 
Table 6-5. File Code for MUM Reports
(Part 1 of 2)

File
Code  Report

20    Activity Resource Report, Special Job Reports

21    IDENT Report

22    Special Job Report (temporary file)

23    Special Job Report (temporary file)

24    Urgency Over Time Report (temporary file)

26    Zero Urgency Job Report (temporary file)

27    Activity Abort Report

31    Plot 1 - (see table 6-1 for Plot Definition) (temporary file)

32    Plot 2 - (see table 6-1 for Plot Definition) (temporary file)

33    Plot 3 - (see table 6-1 for Plot Definition) (temporary file)

34    Excessive Resource Report

35    Plot 4 - (see table 6-1 for Plot Definition) (temporary file)

36    Used for outputting all plots

37    Used for outputting Out of Core Report, Memory Map, and
      Peripheral Allocator Report

42    Histograms, System Program Usage Report, Memory Statistics
      Report, Distribution of Urgency Over Time Report, Zero Urgency
      Job Report

45    Out of Core Report (temporary file)

51    Memory Map Report with one file required for each 128K Memory
      configured (temporary file)

52    Memory Map Report with one file required for each 128K Memory
      configured (temporary file)

53    Memory Map Report with one file required for each 128K Memory
      configured (temporary file)

54    Memory Map Report with one file required for each 128K Memory
      configured (temporary file)
listed separately it is also included under the monitor "MSM". The
monitor "IMS" is also an area of code within the Mass Store Monitor, but
in this case it has not been included under the monitor "MSM". These two
special areas of code, within subroutine T7 (connect trace processing),
are considered to be high usage areas and as such consume significant
processing resources. In order to determine the true overhead of these
areas, so that future code optimization can be considered, these areas are
being reported separately.

Monitor "CM" in this report describes the processor overhead of subroutine
T4 (terminate processing) and subroutine T22 (start I/O processing).
Monitor "MSM" in this report describes the processor overhead of
subroutine T7 (connect processing). Therefore, if the Channel Monitor was
active, but the Mass Store Monitor was not, this report will still list
both "CM" and "MSM" as contributing to the processor overhead. The total
Channel Monitor overhead will be found by adding the overhead of the "CM"
monitor, to the overhead of the "MSM" monitor, to the overhead of the
"IMS" monitor.

If both the Channel Monitor and Mass Store Monitor were active, then the
combined overhead of both monitors can be found as the sum of "MSM" + "CM"
+ "IMS".

For purposes of this report, % overhead is computed as

          (CPU time used by monitor)
----------------------------------------------
(Total Elapsed Time) x (Number of Processors)
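The overhead computation and the "CM" + "MSM" + "IMS" summation can be illustrated with a short Python sketch. The function name `monitor_overhead` and all of the CPU-time figures are hypothetical. The formula as printed shows no factor of 100, so the sketch returns a fraction and converts to a percentage only when printing.

```python
def monitor_overhead(cpu_seconds, elapsed_seconds, processors):
    """Fraction of total processor power consumed by one monitor."""
    if elapsed_seconds <= 0 or processors <= 0:
        raise ValueError("elapsed time and processor count must be positive")
    return cpu_seconds / (elapsed_seconds * processors)


# Hypothetical session: one hour elapsed on a two-processor system.
cpu_by_monitor = {"CM": 36.0, "MSM": 54.0, "IMS": 18.0}
overheads = {
    name: monitor_overhead(cpu, 3600.0, 2)
    for name, cpu in cpu_by_monitor.items()
}

# Total Channel Monitor overhead is the sum of the three entries.
channel_total = overheads["CM"] + overheads["MSM"] + overheads["IMS"]
print("%.2f%%" % (100.0 * channel_total))
```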

Following  this  are  several  lines  describing  the  work  performed  during  the 
monitoring  session.  These  lines  are  self-explanatory. 

If a termination record is not processed, either because the monitor
aborted before a termination record could be written or else time frames
were used, the lines describing GMC overhead will not be printed.

The number of times a processor went idle is derived from the idle
processor traces captured by the IDLEM, with the percentage of processor
idle also being gathered by the collection of idle state information.
This is shown system-wide (i.e., for all the central processors) and then
individually for each processor. This information will not be present if
the IDLEM was not active or if its output reports have been disabled by a
data card option (see figure 6-10).

The number of memory allocator calls, as counted by the monitor, is
shown. This is much less than the number of calls to the multitude of SSA
modules used by the Core Allocator and consists only of those that may
have altered the memory state of the system. The second figure shows how
many times a memory state change might have taken place and did not. This
could be caused by no allocation being possible or by a call to the
allocator pertaining to a matter other than allocation (i.e., a console
message).
The next line printed out is the total CPU and I/O times in seconds and
the ratio of CPU to I/O time. This figure gives the user an idea of
whether the workload processed by the system is I/O or CPU dominant. It
should be noted that these numbers are the amount of CPU and I/O time
generated during the measurement period.


The next two lines give an indication of whether the system has a surplus
or shortfall of memory. The weighted figure is calculated by using the
following formula:


     SUM over i of [ (memory available)i - (demand for memory)i ] x Ti
W = -------------------------------------------------------------------
                             TOTAL TIME


Where i  = calls to the core allocator

      Ti = length of time over which memory availability was in this
           state.

If W comes out positive, there is a core surplus and if W comes out
negative, there is a core shortfall. In the first line, the demand for
memory is taken only from the Core Allocator's queue. In the second line
the demand for memory is taken from the demand in both the Core Allocator
and Peripheral Allocator queues. The Peripheral Allocator's queue consists
of the memory demand that is currently being processed by the Peripheral
Allocator and has not yet reached the Core Allocator. The Peripheral
Allocator will stop transferring jobs to the Core Allocator when the Core
Allocator's queue reaches a predefined length. This second figure
presents a truer picture of memory availability. Jobs from the Peripheral
Allocator are only included if they have been completely processed by the
Peripheral Allocator. These figures present a good first indication of
whether or not availability of memory is a system constraint. In
calculating demand, a job is only included if it does not have a zero
urgency. Any activity with a zero urgency will not be considered to have
a core demand unless the activity is in a loading (activity 0) or
terminating status.
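A minimal Python sketch of the weighted surplus/shortfall figure follows, assuming that each allocator call opens an interval during which the memory available and the memory demanded stay constant. The function name `weighted_memory_balance` and the sample intervals are hypothetical.

```python
def weighted_memory_balance(states):
    """Time-weighted (available - demand), in K words.

    states: sequence of (available_k, demand_k, duration) tuples, one per
    interval between allocator calls. Positive result = surplus,
    negative result = shortfall.
    """
    states = list(states)
    total_time = sum(duration for _, _, duration in states)
    if total_time <= 0:
        raise ValueError("total time must be positive")
    weighted = sum((avail - demand) * duration
                   for avail, demand, duration in states)
    return weighted / total_time


# Hypothetical session: 10 time units with 50K free and 20K demanded,
# then 30 time units with 10K free and 60K demanded.
w = weighted_memory_balance([(50, 20, 10), (10, 60, 30)])
print(w, "surplus" if w > 0 else "shortfall")
```

Running the same function twice, once with demand taken from the Core Allocator queue only and once with the Peripheral Allocator queue added in, would correspond to the two lines of the title page.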

6.3.2 System Program Usage. The report immediately following the title
page provides an overview of the system program load on the memory
subsystem. The data presented consists of the following:

o   Total Memory Time for This System Program
    ------------------------------------------ * 100
           Memory Time for all Programs

This figure would indicate what percentage of the total memory time was
used by this program.

o   Percentage of the Elapsed Time in Memory

o   Total Size Time Product for This System Program
    ------------------------------------------------ * 100
        Total Size Time Product for all Programs
This figure would indicate what percentage of the total size time product
was used by this program. The size-time product of a job is an attempt to
determine the memory effect of a job based not only on its size, but on
the length of time that it runs. A 20K job that runs for three hours
might be more detrimental to a system than a 60K job that runs ten minutes.

o   Total Size Time Product for This System Program
    ------------------------------------------------ * 100
       Total Size Time Product Available to System

Where Total Size Time Product Available to System = (The Elapsed Run Time)
* (Total Allocatable Memory)
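The size-time comparison in the text can be worked through numerically. In the Python sketch below, the function names are hypothetical, and the 256K of allocatable memory and the 180-minute run are assumed values chosen only to make the arithmetic concrete.

```python
def size_time_product(size_k, minutes):
    """Memory occupancy in K-word-minutes."""
    return size_k * minutes


def percent_of_available(product, elapsed_minutes, allocatable_k):
    """Share of the system's available size-time product, as a percentage."""
    return 100.0 * product / (elapsed_minutes * allocatable_k)


small_long = size_time_product(20, 180)   # 20K job running three hours
large_short = size_time_product(60, 10)   # 60K job running ten minutes
print(small_long, large_short)

# Assumed system: 256K allocatable memory over a 180-minute measurement.
print(percent_of_available(small_long, 180, 256))
```

The 20K job accumulates six times the size-time product of the 60K job, which is why the text describes it as potentially more detrimental.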

The next two figures are weighted memory sizes for this program. The
first figure is the weighted memory size of this program while it is in
memory. Therefore, if TSS was in memory during three different time
periods for 1/2 hour, 3/4 hour, and 1 hour, and during these periods its
memory size was 40K, 100K, and 180K respectively, its weighted in-memory
size would be calculated as follows:

                (40)*(.5) + (100)*(.75) + (180)*(1)
Weighted (IN) = -----------------------------------
                               2.25

                 275
              = ---- = 122K
                2.25

Had the calculation not been weighted by time, the average size of TSS
would have been:


(40) + (100) + (180)
-------------------- = 107K
         3

In the above calculation, the report would be stating that the amount of
memory being taken away from the system, by TSS, was 122K. However, what
if TSS was swapped for 50 percent of the total elapsed time? Then TSS
really did not take 122K from the system, but rather only 61K. The second
weighted figure takes into account the total time the program was actually
in memory.
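Both weighted figures can be reproduced from the TSS numbers given above. The Python sketch below uses hypothetical function names, and it assumes a total elapsed time of 4.5 hours so that TSS is in memory for exactly half of the run.

```python
# (size in K, hours resident) for the three TSS residency periods in the text
periods = [(40, 0.5), (100, 0.75), (180, 1.0)]


def weighted_in_memory(periods):
    """Average size weighted by time, counting only time spent in memory."""
    resident = sum(hours for _, hours in periods)
    return sum(size * hours for size, hours in periods) / resident


def weighted_over_elapsed(periods, elapsed_hours):
    """Average size weighted by time, spread over the whole elapsed run."""
    return sum(size * hours for size, hours in periods) / elapsed_hours


print(round(weighted_in_memory(periods)))          # in-memory figure
print(round(weighted_over_elapsed(periods, 4.5)))  # elapsed-time figure
```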

The  final  figure  is  the  number  of  times  this  program  was  swapped. 

In  addition  to  the  standard  system  programs,  any  jobs  requested  by  the 
user,  to  be  considered  as  system  jobs,  also  appear  in  this  report.  In 
figure  6-11  we  see  6  user-requested  jobs  appearing  on  the  report.  The 
user  had  actually  requested  nine  jobs  to  be  considered  as  system  jobs,  but 
three of those jobs never appeared. In a system using multicopies of TSS,
only TS1 (prog #5) will appear in this report. Other copies of TSS must
be requested by user input option "MASTER". In a WWMCCS system, a program
is considered to be a system program if it has a program number less than
decimal 14. Commercial users should use the FSTSLV option to change the
value of 14 to an 8. In addition, the GMC Monitor, $HEALS, VIDEO and
$TRAX are all considered to be system programs.


6.3.3 MUM Reports. The following paragraphs describe the reports output
by MUM.

Report numbers 1-50 are all presented in histogram format (see figure
6-12). At the top of the report, the system name, as well as the time and
date of data collection, are given. This is followed by the title line of
the histogram. Column number 1 indicates the number of occurrences of a
given event, with column number 5 describing the event. In figure 6-12,
we find that 229 times there were 0 user activities in memory, while 2899
times there were 5 activities in memory. Column number 2 is simply a
running total of column number 1. Therefore, the second line in column
number 2 (2008) is merely a running total of the first two lines of column
number 1 (229 + 1779). The fourth column is the percentage of all
activities which will fall into that line of the report. For example,
4063 entries out of a total 23410 entries indicate 2 activities in
memory. This results in a 17.35 percentage figure. This means that
17.35% of all measurements (4063/23410) showed 2 activities being in
memory. Column number 3 is simply a running total of column number 4. It
presents the percentage of measurements which will fall into a given line,
or earlier line. For example, 25.93% of all measurements showed the number
of activities in memory to be 2 or less. There is a graphic display of
these measurements presented to the right of the fifth column. At the
bottom of the report, summary information is provided and is calculated in
the standard statistical manner.
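The five histogram columns can be derived from the raw counts alone. The Python sketch below reproduces the first three lines of the figure 6-12 example; the function name `histogram_columns` is hypothetical, and truncating percentages to two decimals (rather than rounding) is an assumption made so that the output matches the 17.35 quoted in the text.

```python
def histogram_columns(counts, total=None):
    """Return rows of (count, running count, cumulative %, %, event value)."""
    total = sum(counts) if total is None else total
    rows = []
    running = 0
    for value, count in enumerate(counts):
        running += count
        pct = (10000 * count // total) / 100        # column 4, truncated
        cum_pct = (10000 * running // total) / 100  # column 3, truncated
        rows.append((count, running, cum_pct, pct, value))
    return rows


# First three lines of the example; 23410 is the total number of entries.
rows = histogram_columns([229, 1779, 4063], total=23410)
for row in rows:
    print(row)
```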

In  figure  6-13,  we  see  a  similar  histogram  report.  As  displayed  by  column 
5,  we  find  that  each  line  of  the  histogram  represents  a  range  of  values, 
with  an  interval  size  of  200.  This  interval  size  can  be  modified  by  the 
user.  The  lowest  value  in  this  histogram  is  0  (modifiable  by  the  user) 
and  the  size  of  the  histogram  defaults  to  45  lines  (also  modifiable  by  the 
user).  Actually,  for  this  run,  the  lowest  value  recorded  was  42.  Since 
we  can  output  only  45  lines  and  each  line  represents  a  range  of  values  of 
200,  the  largest  value  that  could  be  reported  would  be  9,000  (200  x  45). 

If a measurement falls outside this maximum value, it is reported as an
out-of-range value. In figure 6-13, we find that 21 measurements exceeded
9,000. The average of these 21 measurements was 20188.48. The first line
of the summary includes all measurements that were taken. Therefore, 21
out of 79 entries (26 percent) of all measurements were out of range. The
average of all measurements taken was 5953.62, while the average of the
in-range measurements (all out-of-range values are eliminated) was 799.62.
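The handling of out-of-range values can be sketched in Python as follows. The function name `summarize`, the returned dictionary keys, and the sample measurements are hypothetical; whether a value exactly equal to the maximum counts as in range is not stated in the text, so the sketch assumes it does not.

```python
def summarize(values, low=0, interval=200, lines=45):
    """Bucket values into fixed-width lines and report range statistics."""
    upper = low + interval * lines  # 0 + 200 * 45 = 9000 with the defaults
    bins = [0] * lines
    in_range, out_of_range = [], []
    for v in values:
        if low <= v < upper:
            bins[int((v - low) // interval)] += 1
            in_range.append(v)
        else:
            out_of_range.append(v)

    def mean(xs):
        return sum(xs) / len(xs) if xs else None

    return {
        "bins": bins,
        "out_of_range_count": len(out_of_range),
        "out_of_range_mean": mean(out_of_range),
        "in_range_mean": mean(in_range),
        "overall_mean": mean(list(values)),
    }


s = summarize([42, 150, 8999, 9500, 30000])
print(s["out_of_range_count"], s["out_of_range_mean"], s["overall_mean"])
```

The same relationship holds for the figures quoted above: 21 values averaging 20188.48 combined with 58 values averaging 799.62 give an overall average of 5953.62 across 79 entries.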

Figure 6-13. Out-of-Range Histogram


6.3.3.1 Report 1 - Memory Demand Sizes of New Activities in 1K Word
Blocks. This report shows the demand size, in 1K-word blocks, of each
individual user activity as it was first seen by the memory allocator.
The demand sizes are presented to the allocator by the Peripheral
Allocator. This is a good measure of the memory demand load of a system's
operation and can be used to set System Scheduler classes to correctly
balance the load across varying memory size jobs. This report shows the
percentage of activities which had a particular memory size. For this
report, an entry is made for each new user activity demand at each
allocator call. See Report 10 for an explanation of user vs. system
activity.

6.3.3.2 Report 2 - The Memory Demand Size of All Demand Types. This
report contains the information in Report 1, with the addition of all
other individual demand types. These include activities that are swapped
or involved in a memory compaction procedure. This report should be
similar to Report 1, unless a great number of GEMORE, GERELC, or swap
operations are performed by the user's load. This would alter the memory
size demands from that seen by the allocator at the initial request. For
this report, an entry is made for each activity with an outstanding demand
for each allocator call. Activities with an urgency of 0 are not counted.

6.3.3.3 Report 3 - The Potential Time-Weighted Memory Demand. This
report  shows  the  total  memory  required  if  all  jobs  currently  in  the  system 
were  to  have  their  memory  demand  satisfied  at  one  time.  The  data  shown  in 
this  report  is  the  sum  of  all  jobs  currently  in  memory,  plus  all  jobs 
currently  swapped,  plus  all  jobs  waiting  for  their  original  allocation  of 
memory,  plus  all  jobs  currently  in  the  system  with  a  zero  urgency.  This 
report  is  an  indication  of  the  worst  case  demand  that  may  be  potentially 
placed  upon  the  system.  An  entry  is  made  to  this  report  at  each  allocator 
call. See report 16 for an explanation of time weighting.

6.3.3.4 Report 4 - The Demand That Was Outstanding When a Processor Went
Idle.  This  report  shows  the  sum  of  demand  for  all  activities  in  the 
system  including  outstanding  GEMOREs.  It  is  a  distribution  of  memory 
demand  that  is  not  satisfied,  across  the  measurement  session.  It  should 
be  remembered  that  all  data  is  collected  at  the  Core  Allocator  and  does 
not  represent  the  full  system  load.  Portions  of  the  load  may  be  held  in 
the  System  Scheduler  and  the  Peripheral  Allocator.  Activities  with  an 
urgency  of  0  are  not  counted.  An  entry  is  made  only  if  a  processor  has 
gone  idle  since  the  last  allocator  call.  If  a  large  demand  should  be 
outstanding  during  processor  idleness,  a  system  bottleneck  may  be 
present.  In  this  case,  memory  is  probably  fully  utilized  (i.e.,  demand 
cannot  be  satisfied),  but  the  activities  that  are  occupying  memory  are  not 
using the processor (i.e., a processor has gone idle). This is a good
sign  of  an  I/O  backlog.  IDLEM  data  is  used  to  produce  this  report.  If 
the  Idle  Monitor  was  not  active,  this  report  will  not  be  produced. 

6.3.3.5 Report 5 - The Total Amount of Available Memory. The total
amount  of  available  memory  is  a  key  indicator  of  the  system  memory 
utilization.  If  this  amount  is  continually  low,  the  memory  is  being  fully 
utilized  and  possibly  in  need  of  expansion.  A  continually  high  amount  may 
indicate  another  system  bottleneck  or  an  excess  of  memory.  This  report, 
when  used  in  conjunction  with  Reports  3,  4,  and  6  should  give  a  good 
first-level  indication  of  system  memory  utilization.  It  should  be  noted 
that the availability shown here exists in all quadrants. The
availability is the sum of any and all "holes" in the system and does not
mean that this memory is contiguously available.

Any activity with an urgency of 0 that is currently in memory will have
its memory size included in this availability figure. The reason for this
is  that  if  memory  becomes  a  constraint,  these  activities  can  be  swapped 
and  their  memory  will  become  available  for  use. 

For  this  report,  an  entry  is  made  for  each  allocator  call.  For  most 
analyses,  this  report  will  not  be  used  since  report  8  provides  a  more 
statistically  accurate  representation  of  this  data. 

6.3.3.6 Report 6 - The Memory Available When a Processor Went Idle. The
previous  report  is  repeated  with  the  additional  restraint  that  a  processor 
has  gone  idle  since  the  last  allocator  call.  This  aids  in  identifying 
either  a  bottleneck  or  a  lightly  loaded  system. 

For  this  report,  an  entry  is  made  at  each  allocator  call  that  had  a 
processor  go  idle  since  the  last  allocator  call.  IDLEM  data  is  used  to 
produce this report. This report will not be produced if IDLEM was not
active or the IDLEM Reports have been disabled via user input command.

6.3.3.7 Report 7 - The Time-Corrected Total Demand Outstanding. See
report 16 for an explanation of time correction. The time-corrected total
demand is the sum of all requests for memory known to the allocator as
indicated in report 4. Activities with urgency 0 are not counted.

6.3.3.8 Report 8 - The Time-Corrected Memory Available. See report 16
for an explanation of time correction. This report reflects the
time-corrected amount of total memory available as indicated in report 5.

6.3.3.9 Report 9 - The Number of Activities Waiting for Memory in
Allocator Queue. This report identifies the depth of the allocator demand
queue and includes all activities that are waiting for memory allocation.
Activities  with  a  0  urgency  are  not  considered  as  waiting  for  memory. 

This  report  aids  in  determining  if  too  many  or  too  few  activities  are 
getting  to  the  Core  Allocator  from  the  Peripheral  Allocator.  For  this 
report,  an  entry  is  made  at  each  allocator  call.  For  most  analyses,  this 
report  will  not  be  used  since  report  11  provides  a  more  statistically 
accurate  representation  of  this  data. 

6.3.3.10 Report 10 - The Number of User Activities Waiting Memory in
Allocator Queue. This report is the same as report 9 except that it only
counts those activities of a slave job as identified by their program
number (program number 14 or greater). In order to change this program
number test, the user should see Input Action FSTSLV. In addition, the
user may specify up to ten additional programs that he wants considered as
system programs, even though their program number exceeds 14. The user
should see Input Action MASTER in order to select this option. This
report indicates the "user" work waiting allocation. For this report, an
entry is made on each allocator call. For most analyses, this report will
not be used since report 12 provides a more statistically accurate
representation of this data.

6.3.3.11  Report  11  -  The  Time-Corrected  Number  of  Activities  Waiting
Memory.  See  report  16  for  an  explanation  of  time  correction.  This  report
indicates  the  time-corrected  number  of  activities  waiting  memory  as  in
report  9.

6.3.3.12  Report  12  -  The  Time-Corrected  Number  of  User  Activities  Waiting
Memory.  See  report  16  for  an  explanation  of  time  correction.  This  report
indicates  the  time-corrected  number  of  user  jobs  waiting  memory  in  the
allocator's  queue  as  in  report  10.  See  report  10  for  additional  user
options.

6.3.3.13  Report  13  -  The  Number  of  Activities  Waiting  Memory  When  a
Processor  Went  Idle.  Report  9  is  the  basis  for  this  report,  with  the
additional  criteria  that  a  processor  must  have  gone  idle  since  the  last
allocator  call.  An  entry  is  made  for  each  allocation  where  a  processor
has  gone  idle  since  the  last  call.  IDLEM  data  is  used  to  produce  this
report.  This  report  will  not  be  produced  if  IDLEM  is  not  active  or  the
IDLEM  reports  were  disabled  via  user  input  commands.

6.3.3.14  Report  14  -  The  Number  of  Activities  Residing  in  Memory.  This
report  represents  the  number  of  activities  allocated  memory.  It  indicates
the  multiprogramming  depth  the  system  is  obtaining.  It  is  probably  an
upper  level  since  an  activity  is  allocated  memory  prior  to  and  past  actual
execution.  Any  activity  in  memory,  with  a  0  urgency,  is  not  considered  as
residing  in  memory.  For  this  report,  an  entry  is  made  for  each  allocator
call.  For  most  analyses,  this  report  will  not  be  used  since  report  16
provides  a  more  statistically  accurate  representation  of  this  data.

6.3.3.15  Report  15  -  The  Number  of  User  Activities  in  Memory.  The
activities  shown  in  this  report  are  those  that  are  in  memory  and  have  a
program  number  greater  than  or  equal  to  14.  These  are  user  programs.  For
this  report,  an  entry  is  made  at  each  allocator  call.  As  explained  in
report  14,  any  activity  which  has  an  urgency  of  zero  will  not  be  counted
as  being  in  memory.  See  report  10  for  additional  user  options  in  defining
system  jobs  and  user  jobs.  For  most  analyses,  this  report  will  not  be
used  since  report  17  provides  a  more  statistically  accurate  representation
of  this  data.

6.3.3.16  Report  16  -  The  Time-Corrected  Number  of  Activities  in  Memory.
This  report  presents  the  same  information  as  in  report  14.  The  number  of
entries  at  each  allocator  call  is  determined  by  the  time  since  the  last
allocator  call.  The  result  is  a  simulation  of  a  uniform  sample  rate  of
allocator  calls.  Therefore,  the  noncorrected  reports  display  the
distributions  as  seen  by  the  allocator  itself.  The  time-corrected  reports
present  the  time  weighted  distributions.  As  an  example  assume  that  three

IDLEM  data  is  used  to  produce  these  reports.  These  reports  will  not  be
produced  if  IDLEM  was  not  active  or  if  the  IDLEM  reports  have  been
disabled  by  user  input  command.

6.3.3.39  Report  30  -  Original  Allocation  Time  for  User  Memory  in  1/10
Second.  This  report  gives  the  time  each  user  activity  waited  for  its
original  allocation  of  memory.  See  report  10  for  an  explanation  of  user 
and  system  activities. 

6.3.3.40  Report  31  -  The  Time-Corrected  Percent  of  Assigned  Memory  Used.
This  report  gives  the  time-corrected  percentage  of  slave  memory  used  over 
the  monitoring  period.  Any  memory  being  utilized  by  jobs  with  zero 
urgency  will  not  be  included  in  the  memory-used  figure  for  this  report. 

See  report  16  for  a  definition  of  Time  Correction. 
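
The time correction described under report 16 can be illustrated with a small sketch. It assumes, as a simplification, that each allocator call contributes a number of histogram entries proportional to the time since the previous call, rounded to a hypothetical uniform interval `tick`; the function names, the rounding rule, and the sample data are illustrative and are not taken from the data reduction program.

```python
from collections import Counter


def uncorrected_histogram(samples):
    """One entry per allocator call, as in the noncorrected reports."""
    return Counter(value for _, value in samples)


def time_corrected_histogram(samples, tick=1.0):
    """Weight each allocator call by the time elapsed since the previous call.

    samples: (timestamp_seconds, observed_value) pairs in time order.
    tick:    hypothetical uniform sampling interval, in seconds.
    """
    hist = Counter()
    previous_time = None
    for timestamp, value in samples:
        if previous_time is not None:
            # Number of uniform samples this call stands for (assumed rounding).
            hist[value] += round((timestamp - previous_time) / tick)
        previous_time = timestamp
    return hist


# Hypothetical data: three calls one second apart, then one call after a 10-second gap.
calls = [(0.0, 3), (1.0, 3), (2.0, 3), (12.0, 7)]
print(uncorrected_histogram(calls))
print(time_corrected_histogram(calls))
```

In this made-up session the uncorrected view is dominated by the value seen during the burst of closely spaced calls, while the time-corrected view gives most of its weight to the value reported after the long gap.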

6.3.4  Activity  Resource  Usage  Report.  For  each  activity  known  to  the 
monitor,  a  detailed  Resource  Usage  Report  is  made  upon  termination  of  the 
activity.  The  report  is  ordered  by  termination  time  sequence,  and  the
resource  usage  is  that  known  to  the  system  at  the  last  allocator  call 
(refer  to  figure  6-15). 

Each  activity  is  displayed  via  the  SNUMB  and  activity  number  followed  by
the  CP  and  I/O  charge  times  expressed  in  milliseconds.  These  are  the  CP  and
I/O  times  generated  during  the  monitoring  session.  The  size-time  product
is  the  total  K  words  times  the  microseconds  of  allocation  time,  which 
gives  a  better  expression  for  the  memory  used  by  the  job  than  the  size  of 
the  job.  The  minimum  and  maximum  core  requirements  of  the  job  are  then 
shown,  including  the  activity  Slave  Service  Areas  (SSAs)  as  well  as  slave 
size. 
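
As a rough illustration of why the size-time product says more than size alone, the sketch below sums K words times microseconds over the intervals during which an activity held memory. The interval representation and the sample figures are assumptions made for the example, not the monitor's record layout.

```python
def size_time_product(intervals):
    """Sum of (size in K words) x (allocation time in microseconds)."""
    return sum(k_words * microseconds for k_words, microseconds in intervals)


# Hypothetical activities: both peak at 31K, but hold memory for different lengths of time.
short_job = [(31, 2_000_000)]
long_job = [(20, 1_000_000), (31, 9_000_000)]

print(size_time_product(short_job))
print(size_time_product(long_job))
```

Both activities would show the same maximum size in the report, yet the second one occupies far more memory over time, which is what the size-time product column exposes.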

The  elapsed  time,  in  hours,  an  activity  was  known  to  the  allocator  is
followed  by  the  number  of  times  the  job  size  changed  for  any  reason.  The
wasted  core  column  is  calculated  from  the  job  Slave  Prefix  Area  (SPA)  word
37  octal.  This  is  filled  by  the  System  Loader  and  may  not  be  valid  for
all  job  types  (i.e.,  an  H*  file  is  not  loaded  in  the  normal  system  load
manner).  This  column  is  shown  in  order  to  help  locate  users  that  do  not
have  the  $LIMITS  card  set  correctly  for  the  memory  being  used.  If  the
user  appears  to  be  requesting  excessive  core  on  his  $LIMITS  card,  he  may
be  using  this  extra  space  as  a  spare  buffer  area.  If  this  figure  shows  an
excessive  misuse  of  the  $LIMITS  card,  the  user  should  be  contacted  and
questioned.

The  next  two  columns  provide  a  count  of  the  total  number  of  swaps  and 
moves  incurred  by  the  activity.  The  final  columns  of  the  entry  give
memory  allocation  time,  wait  time,  swap  time,  memory  time,  and  GEWAKE
time,  all  in  tenths  of  a  second  for  each  activity.  An  entry  will  be  made
in  this  report  for  every  activity  of  a  job,  when  the  activity  completes.
Upon  termination  of  the  monitor,  the  resource  usage  of  all  activities  known
to  the  allocator  will  be  reported,  including  system  jobs.  This  output 
follows  a  full  line  of  asterisks  to  denote  that  no  termination  records 
were  found  for  these  activities. 
COLLECTED  ON  SYSTEM  IMCC2  ON  80-12-15  AT  TIME  12:39

ACTIVITY  RESOURCE  USAGE  REPORT  -  REPORTED  PER  ACT

                                                  SIZE      ELAPSED  SIZE    WASTED                 TIME(.1  SEC)  SPENT  IN
SNUMB-ACT  CPU  &  IO  TIME  (MS)  SIZE-TIME  PROD  MIN  MAX  TIME  CHANGE  CORE  SWAPS  MOVES  ALLOC  SWAP  MEMORY  GEWAKE
7332T-  0  44  108  1.5829E  08  31  31  0.003  0  0  0  0  51  0  SS)  0


Figure  6-15.  Activity  Resource  Usage  Report 
Figure  6-17.  (Part  3  of  3)


(16K)  -  including  SSAs 

(12K)  -  urgency  discarded 

(7K)  -  activity  discarded 

(5K)  -  asterisks  discarded 

(4K)  -  no  identification 

As  can  be  seen  in  figure  6-17,  Part  1,  every  line  of  the  figure  has  a  line
number  ranging  from  1-50.  In  addition,  there  is  a  page  number  in  the 
upper  right  corner.  When  the  user  wants  to  match  this  picture  of  the 
first  half  of  the  first  quadrant  with  its  corresponding  half  of  some  other 
quadrant,  the  following  steps  should  be  followed: 

1-  Match  page  numbers  (see  figure  6-17,  Part  2,  Part  3) 

2-  Match  the  line  numbers  from  identical  page  numbers. 

Two  special  names  can  appear  in  the  memory  map.  If  SSA  cache  memory  is 
configured  the  following  letters  will  be  found  in  the  map,  depending  on 
the  size  of  the  SSA  cache  memory: 


Letters  in  map      SSA  cache  size

*SSA CACHE**        12K
*SSA CACHE          10K
*SSA CAC            8K
SSA C               5K  (see  figure  6-17,  Part  3)

If  memory  has  been  released  from  the  system,  then  the  letters  *-RELEASED*
will  appear  in  the  map.  This  will  be  repeated  depending  upon  how  much
memory  has  been  released.

6.3.7  Demand  List  Report.  The  Demand  List  Report  shows  the  memory  demand
outstanding  for  each  memory  state  displayed  on  the  memory  map.  The 
correlation  is  made  using  the  same  line  numbers  as  the  half  quadrants  of 
the  maps  themselves  (refer  to  figure  6-18). 

The  Demand  List  Report  shows  the  total  memory  available,  the  number  of
jobs  waiting  memory,  and  the  demand  request  sizes  for  each  job  waiting
memory.  The  memory  available  is  the  sum  of  all  holes  in  memory.

6.3.8  Activity  Abort  Request.  This  report  is  directly  related  to  the 
Activity  Resource  Usage  Report.  This  report  is  produced  whenever  the 
Activity  Report  is  produced.  For  every  activity  that  aborts  during  the 
monitoring  session,  an  entry  is  made  to  this  report.  The  entry  gives  the 
SNUMB,  Activity  Number,  Abort  Code,  CPU  Time,  Run  Hours,  USERID,  and  IDENT 
for  the  activity. 



The  Abort  Code  is  either  an  octal  number  or  an  alphanumeric  value.  The 
meaning  of  these  codes  can  be  found  in  Appendix  A  of  Honeywell  Manual  DD19
(GCOS). 


statement  that  the  $LIMITS  card  appears  to  be  requesting  more  memory  than
is  actually  required  by  this  job.  The  user  should  be  questioned  in  order
to  determine  if  this  is  in  fact  true.  In  the  Honeywell  System,  a  user
will  receive  whatever  amount  of  memory  requested  on  the  $LIMITS  card,
whether  or  not  the  amount  of  memory  is  actually  needed.  The  Ratio  column
shows  the  ratio  of  the  total  elapsed  time  for  an  activity  divided  by  the
total  memory  time  for  the  activity.  This  value  gives  an  indication  of  the
activity  lengthening  factor;  i.e.,  how  run  time  is  affected  by  resource
contention.  For  those  activities  using  excessive  memory,  the  report  also 
indicates,  under  the  MEM  MIN  column,  the  amount  of  time  the  activity  was 
in  memory.  The  value  being  used  for  the  urgency  check  is  the  average 
urgency  recorded  for  the  activity  and  not  the  maximum  urgency  of  the 
activity.  The  default  values  for  an  entry  being  made  to  this  report  are 
listed  in  table  6-4.  These  values  can  be  changed  via  a  previously 
described  input  option.  This  report  will  be  produced  whenever  the 
Activity  Resource  Report  is  produced  and  will  be  turned  off  whenever  the 
Activity  Resource  Report  is  off  (see  figure  6-21). 

6.3.11  Allocation  Status  Report.  This  report  will  track  an  activity  as
it  proceeds  through  different  phases  of  Allocation.  The  report  lists  the
SNUMB-Activity  #,  amount  of  memory  the  activity  required,  its  current
status,  the  time  it  entered  that  phase  of  allocation,  the  time  it
completed  that  phase  of  allocation,  the  total  time  spent  in  a  given  phase
of  allocation,  the  device  type,  if  any,  that  it  was  waiting  for,  and  the
number  of  devices,  if  any,  the  activity  was  waiting  for.  Due  to  the
manner  in  which  data  is  collected  for  this  report,  it  is  possible  that 
certain  phases  of  allocation  will  be  missed,  especially  if  that  phase  of 
allocation  occurs  within  a  short  time  span.  This  report  will  give  a  good 
indication  of  how  long  it  is  taking  activities  to  pass  through  the  various 
allocation  phases  prior  to  core  allocation.  Following  is  a  list  of  the 
more  common  phases  of  allocation  and  their  meanings: 

New  Act  -  Activity  has  just  entered  the  Peripheral  Allocator

Wait  Media  -  Activity  is  waiting  for  a  device

Wait  Mnt  -  Activity  is  waiting  for  a  disk  pack  or  tape  to  be  mounted

Core  Queue  Full  -  Activity  has  been  completely  processed  and  is
    waiting  for  the  Peripheral  Allocator  to  send  the  job  to  the  core
    allocator

Alloc  Done  -  Activity  has  been  sent  to  core  allocator.  For  this
    case  the  stop  time  and  total  time  columns  have  no  real  meaning.
    These  columns  simply  report  the  amount  of  time  it  took  the
    monitor  to  realize  that  the  activity  had  reached  the  core
    allocator

LIMBO  -  Activity  is  in  Limbo  and  has  not  even  been  granted  permission
    to  run

HOLD  -  Activity  is  in  Hold  and  has  not  even  been  given  permission
    to  run

SCHED  -  Activity  was  in  one  of  the  System  Scheduler  queues.
Only  activities  found  to  be  in  a  state  for  more  than  600  seconds  will  be
reported.  This  limit  can  be  changed  by  using  the  PALC  input  option.  See
figure  6-22  for  a  sample  of  this  report.

6.3.12  Plot  Reports.  Four  different  plot  reports  are  produced  by  the
data  reduction  program.  All  plots  are  produced  under  10-minute  intervals.
Also,  the  interval  can  be  modified  by  the  user.  At  every  allocator  call,
the  various  parameters  to  the  plots  are  accumulated  and  every  10  minutes,
the  accumulated  parameters  are  averaged  and  an  average  value  is  output
7.2  Data  Collection  Methodology

The  MSM  in  the  General  Monitor  Collector  processes  GCOS  trace  types  7  and
15  and  collects  information  to  monitor  the  usage  of  the  entire  disk
subsystem.  The  information  collected  on  the  occurrence  of  the  above  traces
enables  the  MSMDRP  to  identify  the  activity  issuing  the  I/O  request,  the
file  being  accessed,  the  disk  pack  upon  which  the  file  is  located,  arm
movement  required  in  order  to  accomplish  the  requested  file  accessing,  and
the  type  of  accessing  being  requested;  e.g.,  read,  write,  write  verify,  etc.


If  the  system  being  monitored  by  the  MSM  is  configured  with  SSA  Cache  Core,
the  MSM  will  create  two  direct  transfer  traces  (types  73  and  76)  in  order
to  collect  data  to  analyze  the  effectiveness  of  SSA  Cache  Core.  The  method
for  generating  these  new  direct  transfer  traces  is  described  in  subsection
5.2.2,  and  the  formats  for  the  MSM  generated  records  used  by  the  MSMDRP  are
described  in  subsection  5.4.3.

Finally,  if  the  system  being  monitored  by  the  MSM  is  configured  with  FMS
Catalog  Cache  or  is  utilizing  disk  in  core  space  tables,  a  data  record  is
generated  so  that  the  MSMDRP  can  report  on  the  effectiveness  of  FMS  Catalog
Cache  or  in  core  space  table  buffering.

7.3  Analytical  Methodology 

An  evaluation  of  the  Mass  Storage  Subsystem  reports  produced  by  the  MSMDRP 
requires  concurrent  use  of  the  reports  produced  by  the  Channel  Monitor  Data 
Reduction  Program  (CMDRP).  Chapter  14  provides  a  detailed  description  of 
the  procedure  to  be  followed  in  such  an  evaluation.  Subsection  8.3 
provides  a  detailed  description  of  the  entire  I/O  process,  and  the  traces 
generated  during  the  processing  of  an  I/O  request.  In  general,  the  CMDRP
is  used  to  identify  channels  and/or  devices  which  are  acting  as  bottlenecks
to  the  efficient  operation  of  the  system,  while  the  MSMDRP  reports  are  used
to  determine  the  exact  activities,  files,  and  file  codes  that  are  causing
the  contention  uncovered  by  the  CMDRP  reports.  The  MSMDRP  reports  will
also  identify  those  devices  experiencing  seek  elongation  problems  and  the
files  upon  these  devices  which  are  responsible  for  the  seek  elongation.
Finally,  the  MSMDRP  reports  will  identify  those  files  that  are  candidates
for  device  relocation  or  placement  into  Hard  Core  or  SSA  Cache  Buffer  space.

Before  a  user  conducts  a  Mass  Storage  Subsystem  Evaluation,  it  is  important
to  have  an  understanding  of  the  entire  I/O  process.  Subsection  8.3
provides  a  detailed  description  of  the  entire  I/O  process  and  all  traces
generated  during  the  processing  of  an  I/O  request.  In  this  subsection,  a
description  of  only  the  connect  (trace  type  7)  event  will  be  presented.

Each  time  a  system  program  or  application  program  issues  an  I/O  request
(read  disk/tape,  write  disk/tape,  seek,  etc.)  the  GCOS  system  will
generate  a  trace  type  7  (connect  event).  Upon  the  occurrence  of  this
event,  several  internal  tables  are  updated  and  it  is  these  tables  that  the
MSM  references  in  order  to  generate  its  data  record.  A  program's  SSA  area
contains  tables  for  the  Peripheral  Assignment  Table  (PAT)  and  the  PAT
Pointer.  These  are  used  to  describe  the  device  and  space  allocation  for  a 
particular  file  and  the  file  code  to  correlate  a  user  file  code  to  the  PAT 
and  the  device  on  which  that  file  is  allocated.  The  .CRIO  and  .CRCT  tables 
contain  descriptive  information  concerning  device  and  channel 
configuration.  Finally,  the  program's  SSA  area  also  contains  an  area  which 
is  used  for  I/O  entries.  These  entries  are  each  11  words  long  and  contain 
detailed  information  concerning  the  I/O  just  requested.  They  are  referred 
to  as  the  11  word  I/O  queue  entry. 

I/O  requests  can  be  of  two  types  (single  or  multicommand).  Multicommands 
are  of  the  type  seek-read,  seek-write,  or  seek-write  verify.  Single 
commands  can  be  status  requests  of  certain  types,  or  reads/writes,  where 
seeks  are  not  required.  These  different  types  of  I/O  commands  are 
processed  and  reported  in  different  fashions  by  the  various  MSMDRP  reports 
(see  individual  output  reports).  Finally,  whenever  the  system  generates  a 
multi  I/O  command,  it  is  necessary  for  the  system  to  record  the  actual  seek 
address  being  requested.  Normally,  this  seek  address  is  stored  in  I/O 
queue  word  number  4.  Whenever  the  MSM  processes  a  multicommand,  it  expects 
to  find  a  valid  seek  address  at  this  location.  However,  there  are  certain 
occurrences  when  a  multicommand  is  issued  and  I/O  queue  word  4  does  not 
contain  a  valid  seek  address.  In  these  cases,  bit  32  of  I/O  queue  word  2
is  set  to  a  0.

7.4  Data  Reduction  Methodology

The  MSMDRP  currently  uses  random  I/O  (File  58)  to  process  histogram  data 
for  the  Device  Space  Utilization  and  Device  Seek  Movement  reports.  This 
feature  allows  the  MSMDRP  to  process  an  unlimited  number  of  devices  with  a 
minor  increase  in  memory  requirements.  As  delivered,  the  MSMDRP  will 
process  data  describing  75  mass  storage  devices  and  40  mass  storage 
channels.  It  will  produce  130  unique  histograms  with  no  random  I/O.  If
the  number  of  channels  or  devices  is  insufficient,  the  user  will  need  to
edit  file  B29IDPX0/SOURCE/MSM.  The  user  should  enter  the  edit  subsystem
and  process  the  following  command: 

B  RS:/NRDEVXX=XX75/;*:/NRDEVXX=XX  new  number  of  devices/ 

B  RS:/NRCHANXX=XX40/;*:/NRCHANXX=XX  number  of  new  channels/ 

For  each  additional  device,  the  size  of  the  program  will  increase  by  10 
words  and  for  each  additional  channel,  the  program  will  increase  by  45 
words.  For  the  above  edit,  the  character  "X"  signifies  a  space. 


The  next  variable  that  will  need  to  be  changed  is  RPTCNT.  This  number 
represents  the  total  number  of  histograms  and  reports  that  will  be 
processed  with  no  random  I/O.  To  calculate  the  value  required,  the 
following  formula  should  be  used. 

(number  of  devices  actually  configured)  *  2  +  8

If  this  value  is  less  than  138  (65  disk  devices),  no  change  is  required.

If  the  value  required  is  greater  than  138,  the  user  may  alter  this  value.
This  will  help  to  limit  the  amount  of  random  I/O  being  performed  but  will
increase  storage  by  80  words  for  each  increment  above  138.  This  trade-off
between  CPU/IO  time  and  memory  must  be  made  at  the  discretion  of  the  user.
In  order  to  change  this  value,  the  following  edit  function  should  be
performed:

B  RS:/RPTCNTX=X138/;*:/RPTCNTX=Xnew  value/

As  in  the  earlier  edit  example,  the  character  "X"  should  not  be  typed,  but 
is  being  used  to  represent  a  blank  column. 
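
The sizing arithmetic in this subsection can be checked with a short sketch before editing the source. The delivered values (75 devices, 40 channels, RPTCNT of 138) and the per-item word costs come from the text above; the function names and the sample site configuration are hypothetical.

```python
DELIVERED_DEVICES = 75    # NRDEV as delivered
DELIVERED_CHANNELS = 40   # NRCHAN as delivered
DELIVERED_RPTCNT = 138    # RPTCNT as delivered


def required_rptcnt(devices_configured):
    """Histograms and reports needed to avoid random I/O: devices * 2 + 8."""
    return devices_configured * 2 + 8


def extra_words(nrdev, nrchan, rptcnt):
    """Added program size, in words, relative to the delivered values."""
    device_words = max(0, nrdev - DELIVERED_DEVICES) * 10
    channel_words = max(0, nrchan - DELIVERED_CHANNELS) * 45
    report_words = max(0, rptcnt - DELIVERED_RPTCNT) * 80
    return device_words + channel_words + report_words


# Hypothetical site with 80 disk devices and 44 mass storage channels.
rptcnt = required_rptcnt(80)
print(rptcnt, extra_words(80, 44, rptcnt))
```

For this made-up site most of the added storage comes from raising RPTCNT, which is the memory side of the CPU/IO-versus-memory trade-off the text leaves to the user.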

After  performing  the  above  edits,  the  user  should  recompile  the  source 
program  by  entering  the  card  subsystem  and  issuing  a  run  command. 

7.5  MSMDRP  Output 

The  MSMDRP  produces  a  series  of  24  reports  listed  in  table  7-1  over  which 
the  user  has  limited  control.  Those  eight  reports  with  a  NAME=codename 
designation  offer  greater  parameter  control  to  the  user.  This  parameter 
control  will  be  described  in  subsection  7.6.  In  table  7-1,  the  file  nn 
designation  indicates  the  file  code  used  to  record  the  given  report  and  is 
of  no  real  concern  to  the  user.  In  addition,  a  series  of  messages  are 
produced  which  supply  the  user  with  information  concerning  special 
processing  events  that  occurred  during  the  execution  of  the  data  reduction 
program.  Most  of  these  processing  messages  are  for  information  only,  and 
can  be  ignored.  The  following  subsections  will  describe  all  the  reports 
listed  in  table  7-1,  and  subsection  7.5.25  will  describe  the  processing
messages  that  may  be  produced  during  the  course  of  data  reduction. 

7.5.1  System  Configuration  and  Channel  Usage  Report  (File  42).  This
report  documents  the  system  identification,  configuration,  and  the  date  and
time  of  the  monitoring  period,  as  well  as  reporting  the  usage  of  all
configured  disk  I/O  channels.  Tape  Channel  Usage  is  not  reported  by  the
Mass  Store  Monitor.  Figure  7-1  is  an  example  of  this  report.  The  heading
line  indicates  the  software  version  number  that  corresponds  to  this
document.  The  version  number  should  be  09-83  CHG-7.  The  first  line  after
the  heading  provides  the  tape  number(s)  the  report  was  generated  from,  the
system  identification,  the  date  (in  the  form  year,  month,  and  day  -
YYMMDD),  and  the  start  and  stop  times  (HH:MM:SS)  of  the  MONITORING
SESSION.  The  next  several  lines  of  output  describe  the  overhead  of  all  GMF
monitors  that  were  active  during  data  collection.  The  monitor  name  is
given,  its  CPU  time  in  seconds,  and  its  overhead  as  a  function  of  total
processor  power.  The  GMF  executive  overhead  is  separated  from  the  actual
monitors  and  is  listed  as  "EXEC".  The  monitor  "NAME"  is  an  area  of  code
within  the  Mass  Store  Monitor  and  even  though  listed  separately  it  is  also
included  under  the  monitor  "MSM".  The  monitor  "FMS"  is  also  an  area  of
code  within  the  Mass  Store  Monitor,  but  in  this  case  it  has  not  been
included  under  the  monitor  "MSM".

(IN  THE  ACTUAL  REPORT  A  SUMMARY  OF  IOM  NUMBER  1  REPORT  WOULD  FOLLOW)
Figure  7-1.  System  Configuration  and  Channel  Usage  Report

Monitor  "CM"  in  this  report  describes  the  processor  overhead  of  subroutine 
T4  (terminate  processing)  and  subroutine  T22  (start  I/O  processing). 

Monitor  "MSM"  in  this  report  describes  the  processor  overhead  of  subroutine 
T7  (connect  processing).  Therefore,  if  the  Channel  Monitor  was  active,  but 
the  Mass  Store  Monitor  was  not,  this  report  will  still  list  both  "CM"  and 
"MSM"  as  contributing  to  the  processor  overhead.  The  total  Channel  Monitor 
overhead  will  be  found  by  adding  the  overhead  of  the  "CM"  monitor  to  the
overhead  of  the  "MSM"  monitor,  to  the  overhead  of  the  "FMS"  monitor. 

If  both  the  Channel  Monitor  and  Mass  Store  Monitor  were  active,  then  the
combined  overhead  of  both  monitors  can  be  found  as  the  sum  of  "MSM"  +  "CM"
+  "FMS".

For  purposes  of  this  report,  %  overhead  is  computed  as:

           (CPU  TIME  Used  by  Monitor)
-------------------------------------------------
(TOTAL  Elapsed  Time)  x  (Number  of  Processors)
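
A minimal sketch of this calculation, and of the "CM" + "MSM" + "FMS" sum described earlier, is shown below. The function names and the CPU-time figures are hypothetical, and the result is returned as a fraction; whether the report scales it by 100 for printing is not stated here.

```python
def overhead_fraction(cpu_seconds, elapsed_seconds, processors):
    """CPU time used by a monitor divided by (elapsed time x number of processors)."""
    if elapsed_seconds <= 0 or processors <= 0:
        raise ValueError("elapsed time and processor count must be positive")
    return cpu_seconds / (elapsed_seconds * processors)


def channel_monitor_overhead(fractions):
    """Combined overhead of the 'CM', 'MSM' and 'FMS' entries, per the text."""
    return sum(fractions.get(name, 0.0) for name in ("CM", "MSM", "FMS"))


# Hypothetical one-hour session on a two-processor system.
cpu_time = {"EXEC": 20.0, "CM": 36.0, "MSM": 54.0, "FMS": 18.0}
fractions = {name: overhead_fraction(seconds, 3600.0, 2)
             for name, seconds in cpu_time.items()}
print(fractions["CM"], channel_monitor_overhead(fractions))
```

The "NAME" entry is deliberately left out of the sum, since the text states that it is already included under "MSM".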

Following  the  overhead  description  are  six  lines  of  configuration
information  describing  the  number  of  processors,  IOMs,  and  amount  of  memory
configured  to  the  system.  In  addition,  the  size  of  GCOS  Hard  Core,  the
size  of  the  Core  Allocator  and  the  size  of  FILSYS  are  also  presented.  The
fourth  line  of  the  configuration  data  indicates  the  number  of  processors
actually  configured  and  actually  available.  These  numbers  might  be
different  than  shown  on  the  first  line  due  to  the  assigning  and  releasing
of  processors.  In  figure  7-1,  we  see  that  one  processor  was  released  for  a
period  of  time  (i.e.,  CPUs  actually  available  is  equal  to  1.75).  The
actual  time  that  processors  were  available  or  released  is  indicated  in  the
status  message  printouts  (see  subsection  7.5.25).

The  final  two  lines  of  the  report  indicate  how  many  simultaneous  intercom 
I/Os  are  permitted  and  the  maximum  number  of  outstanding  intercom  I/Os 
recorded  during  the  monitoring  session.  The  intercom  I/O  feature  is  the 
means  by  which  two  programs  residing  in  the  H6000  can  pass  data  to  one 
another.  If  the  system  exhausts  all  available  intercom  I/O  buffers,  then
the  programs  requiring  this  facility  will  be  delayed.  These  two  figures 
would  provide  an  indication  that  the  system  did  indeed  exhaust  all 
available  intercom  I/O  buffer  space  whenever  both  lines  contain  the  same 
number.  Whenever  this  occurs,  it  shows  that  the  entire  buffer  table  was 
exhausted  and  therefore,  the  probability  is  high  that  jobs  are  being 
delayed  while  they  are  waiting  for  buffer  space  to  free. 
In  addition,  whenever  the  number  of  outstanding  intercom  I/Os  is  found  to
be  equal  to  the  total  buffer  capability,  a  warning  message  will  be  printed 
on  the  status  message  printout  file.  This  message  will  indicate  the  time 
of  day  that  the  buffer  pool  was  exhausted  and  a  second  printout  will  occur 
whenever  the  number  of  outstanding  intercom  I/Os  falls  back  below  the 
maximum  allowed. 

The  next  portion  of  the  report  documents  the  channel  configuration  by  IOM, 
listing  each  configured  disk  channel  number,  the  disk  device  type
configured  to  that  channel,  and  the  channel  crossbarring.  The  crossbar
column  shows  those  channels  that  are  crossbarred  to  the  channel  identified
under  the  channel  column.  If  SEE  ABOVE  is  found,  the  crossbarring  has  been
displayed  on  a  preceding  channel.  The  I-CC  format  displayed  under  the
CHANNEL  heading  identifies  the  IOM  and  the  channel  number.  The  last  column 
of  this  report  displays  the  number  of  all  connect  types  issued  over  that 
channel.  This  section  will  be  repeated  for  each  IOM  configured  to  the 
system.  Figure  7-1  only  displays  IOM  0  activity.  This  report  is  always 
generated  and  cannot  be  turned  off. 
Figure  7-2.  MSM  System  Summary  Report


7.5.2  System  Summary  Report  (File  42).  The  System  Configuration  and
Channel  Usage  Report  and  the  System  Summary  Report  may  be  used  to  assess 
overall  system  utilization.  Figure  7-2  is  an  example  of  the  System  Summary 
Report.  The  first  set  of  lines  shows  the  number  of  connects  to  each 
monitored  mass  storage  subsystem  compared  to  the  total  connects  issued  to 
all  Mass  Storage  subsystems  and  the  connect  rate  per  hour  over  each 
subsystem.  Most  systems  will  show  a  small  number  of  Control  Connects  being 
generated  by  the  MFCs  configured  to  the  system.  These  Control  Connects 
will  be  summed  together  and  listed  as  a  separate  subsystem  line.  Analysis 
on  a  Shared  Mass  Storage  System  shows  the  number  of  MFC  connects  generated
to  be  a  significant  percentage  of  the  total  connects  generated.  The  next
lines  show  the  breakdown  of  the  mass  storage  connects  by  the  IOM  over  which
they  were  issued.  The  final  part  of  this  report  is  a  list  of  the  commands 
(octal  code  and  mnemonic)  issued  to  the  mass  storage  subsystem  and  the 
count  of  each  issued  during  the  monitoring  session.  This  report  is  always 
generated  and  cannot  be  turned  off. 

A  well  performing  system,  under  a  heavy  workload,  should  show  a  high 
utilization  of  the  configured  resources.  Figure  7-2  shows  that  the  I/O 
activity  is  predominantly  on  the  MSU450  subsystem  configured  on  channels  8 
and  9  of  IOM  0  and  1  (see  figure  7-1).  The  MSU450s  are  receiving  55%  of
all  connects  and,  therefore,  should  be  the  major  area  of  concern.  The 
access  rate  for  every  subsystem  is  reported  on  the  top  of  the  System 
Summary  Report  and  it  can  be  seen  that  the  MSU450s  have  an  access  rate 
significantly  higher  than  the  other  subsystems.  All  signs  indicate  that  if 
system  throughput  is  being  affected  by  disk  activity,  then  the  MSU450s 
would  be  the  probable  cause  of  such  problems. 

The  next  item  to  check  should  be  the  channel  usage.  The  two  highest  used
logical  channels  of  any  subsystem  should  be  on  a  separate  PSI  channel  of  a
two-PSI  channel  subsystem.  If  the  highest  used  logical  channels  are  not  on
separate  PSI  channels,  the  $  XBAR  card  in  the  startup  configuration  section
is  suspected  as  the  cause.  The  channels  are  used  in  the  order  given  on  the
$  XBAR  card  (i.e.,  if  the  primary  channel  is  busy,  the  next  channel  tried
is  given  on  the  crossbar).  The  alternate  use  of  PSI  channels  for  maximum
simultaneity  must,  therefore,  be  appropriately  specified  in  the  boot  deck.
Subsection  8.3  provides  a  detailed  explanation  for  analyzing  the 
correctness  of  the  crossbar  configuration. 

While  looking  at  the  System  Summary  Report,  it  is  also  of  interest  to  note 
the  ratio  of  READ  commands  to  WRITE  commands  (over  three  to  one  in  this
example).  This  gives  an  indication  of  the  nature  of  the  usage  of  the  mass 
storage  space.  A  quick  look  at  the  number  of  write/verify  (WR-VER) 
commands  executed  is  also  of  interest  as  they  are  essentially  double 
(WRITE,  then  READ)  data  transfer  commands  which  require  more  device  and 
channel  time. 

The  general  fraction  of  utilization  for  each  logical  channel  gives  an 
indication  of  the  degree  of  simultaneity  of  access  to  the  subsystem.  If 
only  N  of  the  configured  logical  channels  have  nonzero  counts,  then  there 
were never more than N accesses being performed simultaneously by the
subsystem. The proportional relationships among the counts of accesses
made over each of the logical channels are quantitative indications of the
frequency of occurrence of specific levels of simultaneity. As an example,
assume that the IOM-01 data reported in figure 7-1 (not shown in figure)
indicated that 4487 connects out of a total of 107,273 connects went to
Channel 9, IOM 1. This means that only 4487 times during the measuring
period  were  all  four  MSU450  disk  channels  being  utilized  simultaneously. 

In  this  example,  channel  queuing  (i.e.,  shortage  of  channel  power)  would 
not appear to be a problem. This does not imply that device queuing is
not a problem, only that channel queuing does not appear to be one.
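
The arithmetic behind the channel example above can be sketched as follows. This is only an illustration: the function name and the channel labels are hypothetical, and only the count of 4487 and the total of 107,273 come from the text. The other three counts are invented so that the four channels add up to that total.

```python
def channel_shares(connects_by_channel):
    """Return each logical channel's percentage of all connects."""
    total = sum(connects_by_channel.values())
    if total == 0:
        return {name: 0.0 for name in connects_by_channel}
    return {name: 100.0 * n / total for name, n in connects_by_channel.items()}


# Only 4487 and the total of 107,273 are taken from the text; the first
# three counts are made up so that the channels sum to that total.
example = {
    "IOM 0, channel 8": 60000,
    "IOM 1, channel 8": 30000,
    "IOM 0, channel 9": 12786,
    "IOM 1, channel 9": 4487,
}

shares = channel_shares(example)
lowest = shares["IOM 1, channel 9"]
print(f"lowest priority channel: {lowest:.1f}% of connects")
```

A share of roughly 4 percent on the last channel tried is what the text treats as evidence that all four channels were rarely busy at once.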

If  the  number  of  accesses  to  the  lowest  priority  channel  is  a  larger 
percentage  of  the  total  accesses,  then  channel  queuing  needs  to  be 
examined.  Queuing  for  devices  and/or  channels  can  be  analyzed  by  running 
the  Channel  Monitor  Data  Reduction  Program  (see  section  8). 

7.5.3  System  Traces  Captured  by  Monitor  Report  (File  42).  This  report 
contains  the  number  of  occurrences  of  each  specific  trace  type  recorded  on 
the  data  collector  tape  processed  by  the  MSMDRP  (figure  7-3).  This  report 
provides  little,  if  any,  information  required  by  the  user  for  his 
analysis.  This  report  is  always  generated  and  cannot  be  turned  off. 

7.5.4  Channel  Status  Changes  Report  (File  29).  This  report  lists  the 
initial  status  for  all  tape  and  disk  channels  configured  to  the  system 
(figure  7-4).  If,  during  the  course  of  the  monitoring  session,  a  given 
channel  or  IOM  was  dropped  or  added  to  the  system  (dynamic  reconfiguration) 
a  new  report  will  be  produced  indicating  the  activation  or  deactivation 
changes  and  the  time  that  the  change  occurred.  Finally,  this  report  will 
indicate  whether  the  SSA  cache  option  and  FMS  cache  option  are  active,  and 
if  so,  will  indicate  their  initial  status  and  any  changes  that  occur  to 
that  status.  If  a  given  option  is  not  active,  a  zero  will  be  reported  for 
each  of  the  values.  This  report  is  always  generated  and  cannot  be  turned 
off. 


7.5.5  Physical  Device,  Device  ID  Correlation  Table  (File  42).  Each  mass 
storage  device  configured  in  the  system  is  listed  with  a  unique  device  ID. 
A typical report is presented in figure 7-5. This unique device ID is needed
since  different  devices  can  have  the  same  device  number  on  the  Honeywell 
6000.  (See  Device  ID  1,  Device  ID  7,  and  Device  ID  18  of  figure  7-5). 
These  unique  numbers  are  referenced  in  several  reports  produced  by  the 
MSMDRP.  This  report  is  always  generated  and  cannot  be  turned  off. 

7.5.6 Device Space Utilization Report (File 42). The device space
utilization  histogram  report  is  produced  for  every  device  on  the  mass 
storage  subsystem  and  shows  the  distribution  of  access  to  the  device 
space.  Figure  7-6  is  an  example.  It  should  be  noted  that  the  name  of  the 
device  is  also  given.  This  example  presents  all  connects  made  to  the 
device with the name RF5. If an exchange took place and the RF5 disk pack
was moved from 0-08-05 to 0-08-01, the data reduction program will account
for that exchange, and any connects that are made to 0-08-01 will be
reported on this histogram and not on the 0-08-01 histogram. Entries in
the  column  headed  CYLNDR  NUMBER  give  the  range  of  cylinders  which  form  each 
histogram  bucket.  The  number  of  cylinders  in  each  bucket  is  a  function  of 
the  device  type.  The  entries  in  the  column  headed  INDIV.  NUMBER  give  the 
number  of  accesses  made  to  that  device  within  the  physical  space  defined  by 
the  range  of  cylinders. 

Similarly,  the  columns  headed  INDIV.  PRC  and  CUMUL.  PRC  give  the  individual 
and  cumulative  percentages  of  all  accesses  to  that  device  which  were  made 
within  each  cylinder  range.  The  graphic  portion  of  the  display  gives  a 
visual  indication  of  the  percentage  of  accesses  which  were  made  for  each 
range  of  the  device  space.  This  helps  to  quickly  assess  the  access  pattern 
of  the  usage  for  the  device,  i.e.,  whether  the  device  is  totally  allocated 
and  used  or  locally  used.  Figure  7-6  shows  a  device  whose  usage  is  split 
between  two  extremes. 
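
The relationship among the INDIV. NUMBER, INDIV. PRC, and CUMUL. PRC columns can be sketched as follows. This is a minimal illustration, not the MSMDRP code; the function name and the sample counts are hypothetical, and rounding to one decimal place is an assumption.

```python
def histogram_percentages(bucket_counts):
    """For each cylinder-range bucket, return (count, individual %, cumulative %)."""
    total = sum(bucket_counts)
    rows = []
    running = 0
    for count in bucket_counts:
        running += count
        indiv = 100.0 * count / total if total else 0.0
        cumul = 100.0 * running / total if total else 0.0
        rows.append((count, round(indiv, 1), round(cumul, 1)))
    return rows


# Hypothetical device whose usage is split between the two ends of the pack.
for row in histogram_percentages([50, 0, 25, 25]):
    print(row)
```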

In the upper right hand corner of the report, a report number is
indicated.  This  report  number  is  used  only  to  distinguish  one  histogram 
from  another  and  in  no  way  indicates  the  device  to  which  the  report 
refers.  In  addition,  report  numbers  may  not  appear  sequentially  and  this, 
in  no  way,  is  indicative  of  a  problem.  This  report  is  always  generated  and 
cannot  be  turned  off. 

7.5.7 Device Seek Movement Report (File 42). The seek movement histogram
is  produced  for  devices  in  the  mass  storage  subsystem  being  analyzed  and 
provides  the  distribution  of  distance  traveled  by  the  read  mechanism. 

Figure  7-7  is  an  example.  The  data  used  to  generate  this  report  is  the 
absolute  value  of  the  difference  between  the  cylinder  addresses  of  each 
successive  access  to  the  given  device.  The  column  headed  CYLNDR  MOVED 
contains  the  range  of  seek  movement  distance  for  each  line  of  the  report. 
The  column  headed  INDIV.  NUMBER  contains  the  counts  of  the  number  of 
accesses  which  caused  the  arm  to  be  moved  that  distance.  Figure  7-7  shows 
854  accesses  caused  no  arm  movement  (the  same  cylinder  was  successively 
accessed) for IOM-0 Device 05 on PUB 8. The INDIV. PRC and CUMUL. PRC
columns  give  the  individual  and  cumulative  percentages  of  the  accesses  to 
that  device  which  resulted  in  a  particular  range  of  seek  movement. 
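
The derivation of the seek movement data described above (the absolute difference between the cylinder addresses of successive accesses) can be sketched as follows. The function names, the bucket width of 17 cylinders, and the sample addresses are assumptions made for illustration; the actual bucket layout depends on the device type.

```python
from collections import Counter


def seek_distances(cylinders):
    """Absolute cylinder difference between each pair of successive accesses."""
    return [abs(cur - prev) for prev, cur in zip(cylinders, cylinders[1:])]


def bucket_counts(distances, width):
    """Count distances per (low, high) range of `width` cylinders."""
    counts = Counter()
    for d in distances:
        low = (d // width) * width
        counts[(low, low + width - 1)] += 1
    return dict(counts)


# Hypothetical access sequence alternating between cylinder 0 and the far
# end of the pack, with one repeated access to the same cylinder.
accesses = [0, 720, 0, 0, 730]
distances = seek_distances(accesses)
print(distances)
print(bucket_counts(distances, 17))
```

Alternating accesses between two widely separated areas show up as a cluster of long seeks, while a repeated access to the same cylinder lands in the zero-movement range.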

When  using  this  report  to  analyze  seek  elongation  problems,  it  is  fairly 
easy  to  determine  the  location  of  files  that  are  causing  the  seek 
elongation  problem.  For  example,  figure  7-7  shows  265  (12.8  percent)  of 
the  accesses  to  the  device  resulted  in  an  arm  movement  of  between  714-750 
cylinders.  If  figure  7-6  is  examined,  it  will  be  observed  that  564 
connects  were  made  to  cylinder  0  and  215  connects  were  made  to  cylinders 
714-750.  If  the  accesses  to  these  2  areas  were  made  in  alternating 
fashion,  the  resulting  arm  movement  distance  would  be  714-750  cylinders. 

In addition, figure 7-6 also shows 207 accesses to cylinders 34-50 and 612
accesses  to  cylinders  748-764.  Once  again,  if  accesses  to  these  two  areas 
were  made  in  alternating  fashion,  the  resulting  arm  movement  distance  would 
be  714-750  cylinders.  By  using  other  reports  produced  by  MSMDRP,  the 
analyst  can  next  determine  the  actual  files  located  in  these  areas,  and 
perhaps  relocate  one  or  more  files  so  as  to  eliminate  this  excessive 
seeking.

The above procedure becomes more complicated when analyzing disk types 500
and 501. These device types consist of two logical devices configured as
one  physical  device.  Access  to  one  member  of  the  pair  may  affect  the 
performance  of  the  other  member  because  of  the  fact  that  the  hardware 
design  considers  the  two  devices  to  be  a  single  physical  device.  In  figure 
7-7.3,  it  can  be  observed  that  9  accesses  to  that  device  resulted  in  an  arm 
movement  of  545-561  cylinders.  If  figure  7-7.2  is  examined  next,  it  can 
readily  be  seen  that  the  access  pattern  to  this  device  could  never  result 
in  arm  movement  of  between  545-561  cylinders.  However,  if  figure  7-7.1 
is  checked,  it  will  be  observed  that  7918  connects  did  occur  on  cylinders 
545-561.  Therefore,  we  now  see  an  example  of  seek  movement  conflict 
between  device  pairs,  as  opposed  to  conflict  on  a  single  device.  In  order 
to  analyze  this  situation,  it  would  be  necessary  to  determine  the  files 
located  at  cylinders  545-561  on  device  MSH  and  also  the  files  located  at 
cylinder  0  on  device  MSI. 

Further  confirmation  of  a  seek  elongation  problem  could  be  found  by 
analyzing the Head Movement Efficiency Report (see subsection 7.5.8 and
figure 7-8). If a problem exists, then the connects/arm movement column for
a particular device should approach a value of one. Figure 7-8 indicates
how  many  connects  are  issued  between  each  movement  of  the  arm.  Therefore, 
even  though  we  may  have  long  seeks  occurring  on  the  device,  if  a  large 
number  of  connects  are  being  processed  between  these  seeks,  this  would  tend 
to  lessen  the  impact  of  the  long  seeks. 

For  the  Device  Space  Utilization  Report  and  Device  Seek  Movement  Report,  an 
entry  is  made  only  for  multi-command  connects  (see  subsection  7.3)  such  as 
a seek/read or seek/write. If the first command of an I/O connect is not a
seek, or pre-seek, then an entry will not be made to this set of
histograms. For this reason, the number of connects reported in these
reports, for a given device, may be somewhat lower than that reported in
the Proportionate Device Utilization Histogram, described in subsection
7.5.20.

7.5.8  Head  Movement  Efficiency  Report  (File  42).  This  report  displays  how 
many connects are issued per arm movement of the device. Figure 7-8
(generated from a different monitoring session than that used to obtain
figures 7-6, 7-7, 7-7.1, 7-7.2 and 7-7.3) is an example. The first three
columns  give  the  IOM,  Channel,  and  Device  number  of  the  device.  This  is 
followed  by  the  number  of  connects  issued  to  that  device  and  the  number  of 
times  any  arm  movement  was  required  (size  of  the  seek  is  not  considered). 
The  final  column  indicates  the  ratio  of  connects  to  arm  movements.  The 
larger  this  ratio  is,  the  more  efficient  is  the  device  (i.e.,  the  larger  is 
the  number  of  connects  being  handled  between  each  arm  movement  of  the 
device).  Following  the  breakdown  of  arm  movement  by  individual  device,  a 
summary  is  presented  for  arm  movement  within  each  individual  mass  storage 
subsystem.  This  is  followed  by  three  lines  of  output  summarizing  the 
overall  efficiency  of  the  entire  disk  subsystem.  The  first  line  presents 
the total number of connects issued, the second line presents the total
number of arm movements, irrespective of the number of cylinders traversed
by each movement, and the final line presents the overall head movement
efficiency (line 1 divided by line 2). By examining figure 7-8, the user
will observe that the MSU450s appear to be least efficient, with the
exception of device number 2. This report is always generated and cannot be
turned off.

Figure 7-8. Head Movement Efficiency Report
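
The ratio reported in the final column, and the overall figure on the last summary line, can be sketched as follows. The function names and the sample counts are hypothetical; returning None when no arm movement occurred is a choice made for this sketch, since the text does not say how the report handles that case.

```python
def head_movement_efficiency(connects, arm_movements):
    """Connects issued per arm movement; None if the arm never moved."""
    if arm_movements == 0:
        return None
    return connects / arm_movements


def overall_efficiency(devices):
    """devices: iterable of (connects, arm_movements) pairs, one per device."""
    total_connects = sum(c for c, _ in devices)
    total_movements = sum(m for _, m in devices)
    return head_movement_efficiency(total_connects, total_movements)


# Hypothetical devices: the first handles four connects per arm movement,
# the second moves the arm on every connect (ratio of one).
sample = [(1000, 250), (300, 300)]
for connects, movements in sample:
    print(head_movement_efficiency(connects, movements))
print(overall_efficiency(sample))
```

A per-device value close to one corresponds to the case the text flags as a likely seek problem.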

7.5.9 System File Use Summary Report (File 21). This report indicates
where each system file is located and to what extent it was accessed across
the measurement session. Figure 7-9 shows an example of this report. This
report  is  produced  by  default  but  can  be  turned  off  with  the  input  option 
(OFF)  (subsection  7.6.9).  The  sum  of  accesses  to  all  system  files  is  then 
expressed  as  a  percentage  of  all  mass  storage  accesses.  The  system  files 
are  those  files  defined,  via  startup  in  the  .CRDIT  table.  As  can  be  seen 
in  figure  7-9,  the  file  names  listed  under  the  File  Name  column  are  not  the 
actual  file  name,  but  rather  relative  file  names  indicative  of  their 
position  description  within  the  startup  deck.  Actual  file  names  can  be 
output  in  this  report  if  the  user  selects  the  input  option  described  in 
subsection 7.6.4.

Included in the list of system files are two additional file names of ACTNG
and  SYSOUT.  These  files  indicate  accesses  to  the  accounting  file  and  any 
configured  SYSOUT  files.  For  these  files,  the  information  provided  under 
STARTING  SECTOR/CYLINDER  is  not  the  actual  starting  address,  but  rather  the 
smallest  address  accessed  during  the  monitoring  session.  Likewise,  the 
information  provided  under  the  LENGTH  column  is  not  the  actual  length  of 
the  file,  but  rather  the  difference  between  the  largest  and  smallest 
addresses  accessed  during  the  monitoring  session. 

This  list  of  system  files  is  then  followed  by  a  list  of  modules  which 
reside  in  hard  core  because  they  are  hard  core  modules  or  because  they  have 
been  loaded  into  hard  core  by  system  personnel  in  order  to  save  on  I/O 
processing.  If  a  system  module  is  not  loaded  in  hard  core,  and  is  required 
for some processing, then the system must perform an I/O function to read
this module from disk into a user's SSA work space. A significant amount
of such system I/O can cause severe system degradation. This degradation
can be reduced by placing additional system modules in hard core or else by
increasing the size of SSA Cache Memory. The System File Use Summary
Report and the following Individual Module Activity Report should provide
sufficient information to determine whether user action is required to
reduce system I/O overhead. If the percentage of system I/O reported in this
report is greater than 5-1%, then some user action is probably required.

If additional hard core space is available, the user should move as many
system modules as possible into hard core. Each hard core module requires
1/2K (512 words) of memory, and there is 64K of hard core memory
available. The System Configuration and Channel Usage Report (subsection
7.5.1) indicates the amount of hard core currently in use. The Individual
Module Activity Report (subsection 7.5.10) can be used to indicate which
system modules should be transferred to hard core. If sufficient hard core
is not available, then the size of SSA cache should probably be increased.
Once again, the Individual Module Activity Report should be referenced to
see if this type of action would aid in reducing system I/O.
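
The hard core arithmetic above can be sketched as follows. This is an illustration only: it assumes 1K means 1024 words, that the 64K figure is the total hard core limit, and that the amount currently in use is read from the System Configuration and Channel Usage Report. The names are hypothetical.

```python
WORDS_PER_K = 1024
HARD_CORE_LIMIT_WORDS = 64 * WORDS_PER_K  # 64K limit stated in the text
MODULE_WORDS = 512                        # 1/2K per hard core module


def modules_that_fit(hard_core_in_use_words):
    """Number of additional 512-word modules that fit in the remaining hard core."""
    free_words = HARD_CORE_LIMIT_WORDS - hard_core_in_use_words
    return max(free_words, 0) // MODULE_WORDS


# Hypothetical site with 60K of hard core already in use.
print(modules_that_fit(60 * WORDS_PER_K))
```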
Figure 7-9. System File Use Summary Report


7.5.10 Individual Module Activity Report (File 21). This report shows the
accessing  done  to  each  system  module  (figure  7-10).  The  report  presents 
the  system  file  the  module  resides  in,  followed  by  the  module  name  and 
type.  The  module  location,  access  count,  and  percentage  of  system  file 
usage  is  then  given.  The  last  two  entries  give  the  total  number  of  SSA 
CACHE  buffer  hits  and  disk  loads  this  module  accumulated  (these  values  are 
0 if SSA CACHE is not active). A minus one in either column indicates a
nonstandard  SSA  module.  The  Number  of  Accesses  column  reports  the  number 
of  connects  made  to  this  SSA  module  as  determined  by  the  issuance  of  a 
trace  type  7.  The  Disk  Load  column  reports  the  number  of  times  the  SSA 
CACHE  logic  claims  to  have  issued  a  connect  to  this  SSA  module.  In  those 
cases  where  no  lost  data  occurred  during  the  monitoring  session  and  a  GMC 
termination  record  was  generated  (i.e.,  GMC  terminated  correctly),  these 
two  columns  should  display  equal  values.  Figure  7-10  shows  this  to  be  the 
case  for  almost  all  of  the  SSA  modules.  However,  there  are  some 
exceptions. Module .MALC6 shows 109 accesses but only 104 disk loads
(i.e.,  a  difference  of  five).  This  apparent  inconsistency  has  been 
reported  to  Honeywell,  and  an  explanation  requested. 

If  lost  data  occurred  during  the  monitoring  session  (i.e.,  trace  type  7 
data lost), the Number of Accesses column could be significantly lower than
the  Disk  Load  column.  If,  on  the  other  hand,  GMC  did  not  abort  cleanly  and 
a  termination  record  was  not  generated,  the  Disk  Load  column  could  be 
significantly  lower  than  the  Number  of  Accesses  column. 

The  module  names  are  all  hard  coded  in  the  MSM  Data  Reduction  Program.  To 
verify  that  the  module  names  are  correct,  especially  for  commercial 
systems,  the  following  GMAP  program  should  be  run  to  produce  a  module  list 
and  number.  This  list  can  then  be  compared  to  the  list  in  the  routine 
BLOCK  DATA.  The  GMAP  program  is  as  follows: 

$      GMAP    NDECK
       LODM    .G3MAC
       LODM    .G3MCR
       .MDEF1  .LIST
       SYMDEF  START
START  NULL
       END

Finally,  it  should  be  noted  that  data  from  the  last  two  columns  pertains 
only  to  "STANDARD  SSA"  modules  (see  the  Type  column).  Modules  that  are 
typed  as  "ABSOLUTE"  or  "EXCEPT  PROC"  are  not  placed  into  SSA  Cache  Core  and 
therefore  do  not  generate  values  for  the  last  two  columns.  This  report  is 
produced  by  default  but  can  be  turned  off  with  the  input  option  (OFF) 
(subsection  7.6.9). 

When  generating  this  report,  the  data  reduction  program  writes  to  file  code 
55,  which  is  used  to  produce  a  Job  SSA  Module  Usage  Report  (subsection 
7.5.11). If this report is desired, the report must be requested via input
option (MODULE) (subsection 7.6.6).

At the bottom of this report, a summary line is produced indicating the
percentage  of  buffer  hits  and  disk  loads.  If  the  percentage  of  buffer  hits 
is  less  than  90,  then  the  size  of  SSA  cache  should  be  increased.  For  each 
1K increase, 2 additional modules will be loaded into the SSA Cache memory.
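
The 90 percent rule and the effect of enlarging the cache can be sketched as follows. This is a hedged illustration: it assumes the buffer hit percentage is hits divided by the sum of hits and disk loads, which the text implies but does not state, and the function names and sample counts are hypothetical.

```python
def buffer_hit_percentage(buffer_hits, disk_loads):
    """Percentage of SSA module requests satisfied from the cache."""
    total = buffer_hits + disk_loads
    if total == 0:
        return None
    return 100.0 * buffer_hits / total


def cache_should_grow(buffer_hits, disk_loads, threshold=90.0):
    """True when the hit percentage falls below the threshold given in the text."""
    pct = buffer_hit_percentage(buffer_hits, disk_loads)
    return pct is not None and pct < threshold


def extra_modules_held(increase_in_k):
    """Two additional modules are held per 1K of added SSA cache."""
    return 2 * increase_in_k


# Hypothetical summary line: 850 buffer hits and 150 disk loads.
print(buffer_hit_percentage(850, 150))
print(cache_should_grow(850, 150))
print(extra_modules_held(4))
```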

When  using  this  report  to  determine  which  additional  SSA  modules  should  be 
placed  into  GCOS  Hard  Core,  the  user  should  reference  the  "%  of  Activity" 
column.  Those  modules  with  the  largest  reported  figure  would  be  candidates 
for movement. In figure 7-10, .MALC6, .MALC9, .MFS03, and .MFS04 would be
candidates for movement.

7.5.11 SSA Module Usage Report by Job (File 21). When requested by the
user, this report will produce a listing for every job run during the
monitoring  period,  showing  all  SSA  modules  referenced  by  that  job  and  the 
number  of  such  references.  An  example  is  shown  in  figure  7-11.  This  is 
the  best  method  to  use  when  determining  which  SSA  modules  should  be 
softloaded  into  TSS  core.  This  also  provides  an  excellent  means  for 
studying  the  usage  of  SSA  modules  in  general.  This  report  is  off  by 
default and must be requested with a user input option (MODULE) (subsection
7.6.6). 


7.5.12 File Code Summary Report (File 23) (NAME*FILECODE). The File Code
Summary  Report  lists,  by  each  activity,  the  files  allocated  to  mass 
storage,  their  location  and  size,  and  the  number  of  accesses  made  to  each 
in  the  system  during  the  monitoring  period.  Figure  7-12  is  an  example  of 
this  report.  The  activities  in  this  report  are  in  the  same  order  as  they 
appear  in  the  Activity  Summary  Report  (see  subsection  7.5.15). 

Each  activity  is  identified  by  its  SNUMB,  activity  number,  and  $  IDENT  and 
USERID  cards.  There  are  as  many  data  lines  as  necessary  to  describe  each 
mass  storage  file  used  by  the  activity  and  the  number  of  times  the  file  was 
accessed.  There  is  one  line  per  file,  and  the  file  is  described  by  its 
two-character  file  code,  the  device  on  which  it  was  allocated  (ALLOCATED 
DEVICE), its origin on that device (FILE ORIGIN) in units of LLINKS (320
words)  and  cylinders  relative  to  the  beginning  of  the  device,  and  the  size 
of  the  file  (FILE  SIZE)  in  LLINKS  and  cylinders.  The  column  headed 
CONNECTS  gives  the  count  of  the  number  of  accesses  made  to  the  file. 


There  are  several  special  file  codes  that  will  appear  in  this  report.  The 
following  file  codes  will  appear  for  almost  every  activity  that  is 
processed:

00 - Mass storage accesses made without a normal PAT entry; e.g.,
     accesses made by the operating system as part of job initialization
     which are done without a PAT for efficiency
— - All accesses to SYSOUT
■1 - File grow, PAT refresh, permanent allocation
■7 - Temporary allocation
-9 - SWAP/GECALL
Of> - Load, pop, or push of an SSA
*, - FILSYS catalog search connect
J* - JCL file
*J - Data subfiles for a job
■A - Accesses to the accounting file - normally made by $CALC

Figure 7-12. File Code Summary Report


The  file  code  report  for  the  TSS  subsystem  will  also  contain  several 
special file codes. Following is a list of these:

TU  -  user  files  requested  under  TSS  subsystems. 

-  file  code  referenced  by  TSS  that  contains  a  character  that  cannot 
be  printed.  The  symbol  is  being  used  instead. 

JO  -  TSS  I/O  to  this  file  code  is  JOUT  processing. 

It  should  be  realized  that  these  file  codes  are  not  necessarily  used  for  a 
unique  file  request,  and  therefore  may  actually  reference  several  different 
files.  Therefore,  in  figure  7-12  it  can  be  seen  that  activity  1  for  SNUMB 
2802T  made  479  connects  to  file  code  00  located  on  device  1-10-2.  In 
actuality,  the  activity  may  have  connected  to  several  different  devices, 
each  time  referencing  this  special  file  code.  Instead  of  reporting  these 
connects  as  separate  entries,  all  the  requests  are  grouped,  and  the 
allocated device that is reported is the one that was referenced by the
first of the 479 connects. There is a user option available (AREA) that
expands  the  usage  definition  of  these  file  codes.  This  option  will  then 
display  all  unique  requests  using  these  file  codes  rather  than  grouping  all 
of  the  unique  file  codes  into  a  single  collection  file  code.  In  addition, 
it  should  be  noted  that  the  information  reported  under  the  file  size  column 
(for  these  special  system  files  and  TSS  files)  is  normally  reported  as  0, 
and  the  information  reported  under  the  origin  column  (for  these  same  files) 
is  not  the  actual  origin  of  the  file,  but  rather  the  location  of  the  first 
seek  address  made  to  that  file. 

This  report  may  produce  excessive  output  and  therefore  is  not  produced 
under  default  conditions.  The  user  must  explicitly  request  this  report 
with the use of an input option (ON) (see subsection 7.6.10).

Each  file  code  entry  has  information  which  indicates  the  type  of  file  by  a 
"P"  for  permanent  and  "T"  for  temporary.  This  entry  is  located  on  the 
right  side  of  the  file  code  characters.  Immediately  to  the  right  of  the 
file  type  character  is  the  access  characteristics  of  the  file.  This  is 
denoted  by  an  "R"  if  random,  or  an  "S"  for  a  sequential  file.  The  next 
character defines a permanent file as cataloged, "C", or noncataloged,
"N". This character for a temporary file will be a blank as it is neither
cataloged  nor  noncataloged.  GCOS  files  are  usually  defined  as 
noncataloged,  permanent  files  and  are  treated  at  startup  time  much  like 
permanent  files  by  the  file  system.  They  are  not,  however,  given  the 
allocation  treatment  during  normal  operation  time  that  is  given  to 
permanent  files.  The  final  character  is  an  "F"  for  fixed  or  an  "R"  for 
removable. 
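
The four attribute characters described above can be decoded as sketched below. This is an illustration only: it assumes the characters appear as four adjacent positions in the order given in the text (file type, access, catalog status, device class), and the function and table names are hypothetical.

```python
FILE_TYPE = {"P": "permanent", "T": "temporary"}
ACCESS = {"R": "random", "S": "sequential"}
CATALOG = {"C": "cataloged", "N": "noncataloged", " ": "not applicable"}
DEVICE = {"F": "fixed", "R": "removable"}

FIELDS = (
    ("type", FILE_TYPE),
    ("access", ACCESS),
    ("catalog", CATALOG),
    ("device", DEVICE),
)


def decode_file_flags(flags):
    """Decode the four attribute characters printed to the right of a file code."""
    if len(flags) != 4:
        raise ValueError("expected exactly four flag characters")
    decoded = {}
    for (field, table), ch in zip(FIELDS, flags):
        if ch not in table:
            raise ValueError(f"unexpected {field} flag {ch!r}")
        decoded[field] = table[ch]
    return decoded


print(decode_file_flags("PRCF"))  # permanent, random, cataloged, fixed
print(decode_file_flags("TS R"))  # temporary, sequential, blank, removable
```

Note that "R" means random in the second position but removable in the fourth, which is why the sketch decodes by position rather than by character alone.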

NOTE: A file size of 16384 for a nonsystem file implies the file size is
greater  than  16383  llinks. 

7.5.13 Cat/File String Report (File 23). Immediately following the File
Code Summary Report, the user will find the CAT/File String Report (figure
7-13). This report will provide the CAT/File String for every permanent
user  file  referenced  during  execution.  It  will  not  list  the  CAT/File 
String  for  any  special  system  files.  In  addition,  it  will  indicate  the 
total  number  of  connects  required  to  locate  the  file  (catalog  searching) 
every time the system was required to search for that file via a MME
GEFSYE. Finally, it will indicate the number of connects required to

This report will display all connects issued by a job with no regard to the
type  of  I/O  command  or  the  validity  of  the  seek  address  (see  subsection 
7.3).


7.5.16 Device Area File Code Reference Report (File 22). This report is
generated  to  provide  details  on  the  jobs  accessing  a  specific  device  area 
with  their  file  codes.  Figure  7-16  displays  an  example.  The  devices  and 
areas  to  be  listed  are  defined  by  the  user  when  requesting  input  option 
AREA (subsection 7.6.1). In figure 7-16, there are 10 areas requested for
investigation.  Each  activity  that  accessed  a  device  area  is  displayed  in 
the  report.  At  the  end  of  the  report,  the  number  of  connects  found  to  each 
requested  area  is  also  given.  This  report  is  identical  in  format  to  the 
File Code Summary Report (subsection 7.5.12) except that this report
contains  only  the  file  codes  which  referenced  the  specific  area  of  the 
desired  devices.  The  AREA  N  of  each  file  code  specifies  within  which  area 
of  the  possible  set  of  requested  areas  this  particular  file  code  fell. 

When  this  option  is  selected,  the  file  code  reference  will  automatically  be 
expanded  and  special  system  file  codes  will  be  reported  only  if  they 
actually referenced the requested area (see subsection 7.5.12). In
addition,  if  the  special  system  file  codes  referenced  multiple  areas,  these 
file  codes  will  appear  multiple  times  within  this  report.  In  figure  7-16, 
it  can  be  seen  that  activity  3  of  job  52323  has  multiple  references  to  file 
code  Of.  In  the  standard  File  Code  Summary  Report,  all  these  references 
would  have  been  grouped  as  a  single  reference,  but  in  this  report,  they  are 
expanded  within  each  unique  area  requested  by  the  user. 

A  complete  explanation  of  the  special  file  codes  can  be  found  in  subsection 
7.5.12. This report is not produced under default conditions and must be
requested with a special user input option (AREA) (subsection 7.6.1).

7.5.17 Device File Use Summary Report (File 21). This report shows the
device  use  by  the  accesses  per  file  class  (temporary  or  permanent).  Figure 
7-17  is  an  example  of  this  report.  Each  of  these  classes  of  allocation  is 
subdivided  into  sequential  and  random  files  and  their  corresponding 
percentage  of  the  total  file  use  is  presented  in  the  report.  File  00 
accesses are not included in this report. This report will reflect only
multicommand connects. The device numbers being reported under the
"DEVICE" column are the unique set of device numbers generated by the
MSMDRP (see subsection 7.5.5). This report is on by default but may be
turned  off  with  a  user  input  option  (OFF)  (subsection  7.6.9). 

7.5.18 Chronological Device Utilization Report (File 26). This report
provides  a  chronological  listing  of  the  six  most  active  disk  devices,  by 
device  number  and  their  probability  of  utilization  (see  figure  7-18).  This 
report  is  so  designed  that  any  time  quantum  can  be  set  in  the  report.  By 
varying  the  time  quantum  parameter,  the  user  may  select  integer  values  from 
1  to  n  (where  n  is  a  positive  value  in  seconds).  A  time  quantum  variation 
is requested with a user input option (TIMEQ) (subsection 7.6.14).

followed by many accesses to device 3 followed by many accesses to device 2
followed  by  many  accesses  to  device  1,  etc.  Each  device  could  have  been  a 
bottleneck  for  a  subperiod  of  the  total  monitoring  period.  This  could  also 
have  been  the  case  if  the  proportionate  utilization  of  each  device  was 
equal. The Channel Monitor can be used to uncover this cyclic type of
usage.  In  addition,  the  Chronological  Device  Utilization  Report  (see 
subsection 7.5.18) was designed to uncover this type of problem by breaking
down  device  utilization  over  time,  rather  than  by  utilizing  a  histogram. 
Nevertheless,  when  a  single  or  small  number  of  devices  has  a 
disproportionately  large  share  of  the  accesses,  they  are  potential 
bottlenecks  and  their  usage  should  be  further  analyzed. 

This report will show all connects that were issued to a given device.
This includes all read/write connects, as well as any command type connects
issued to a given device (see subsection 7.3).

This  histogram  can  report  a  maximum  of  50  devices.  If  a  site  is  configured 
with more than 50 devices, a second report will be produced as a
continuation  of  Report  1. 

7.5.21 Elapsed Time Between Seeks Report (File 42). This is a histogram
report  for  the  frequency  of  occurrence  of  elapsed  time  intervals  between 
the  issuance  of  mass  storage  access  connects.  Figure  7-21  presents  a 
sample.  The  elapsed  time  is  calculated  as  the  time  difference  between 
successive  mass  storage  connects  from  the  central  system  and  thus  is 
representative  of  the  workload.  It  does  not  provide  any  meaningful 
information  on  the  subsystem  service  capabilities. 

The  data  presented  give  the  count  (INDIV.  NUMBER)  and  percentage  (INDIV. 
PRC)  of  elapsed  time  between  accesses  which  fell  within  each  time  range. 

The  column  headed  TIME  MSECS  gives  the  time  range  in  milliseconds.  Thus, 
the  data  of  the  row  with  a  time  of  18  gives  the  count  and  fraction  of 
elapsed  time  intervals  in  the  range  of  17+  to  18  milliseconds.  The  columns 
headed  CUMUL.  NUMBER  and  CUMUL.  PRC.  give  the  accumulated  counts  and 
percentage and are useful in describing the mass storage rates, e.g., 75.4
percent of the accesses occur less than 21 ms after the last access.

The  bottom  of  the  report  provides  a  statistical  summary  of  the  data  in  the 
report.  Statistics  given  include  average,  variance,  and  standard 
deviation.  These  statistics  apply  to  all  data  points  that  were  measured. 
The  statistics  concerning  OUT  OF  RANGE  are  for  those  data  points  which  fall 
outside  the  range  of  the  histogram.  OUT  OF  RANGE  points  are  included  in 
the  previous  statistics.  This  report  is  always  generated  and  cannot  be 
turned  off. 
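
The column layout described above can be sketched in Python. This is a minimal illustration, not the MSMDRP implementation: the function name, the 50 ms histogram limit, the use of population variance, and the choice to compute percentages over all measured intervals are assumptions made for the example.

```python
import math
from statistics import fmean, pvariance


def elapsed_time_histogram(connect_times_ms, max_ms=50):
    """Return (rows, stats) for the gaps between successive connects.

    Each row is (TIME MSECS, INDIV. NUMBER, INDIV. PRC, CUMUL. NUMBER,
    CUMUL. PRC). A row labelled t counts gaps in the range (t-1, t] ms.
    """
    gaps = [later - earlier
            for earlier, later in zip(connect_times_ms, connect_times_ms[1:])]
    if not gaps:
        raise ValueError("at least two connect times are required")

    counts = [0] * (max_ms + 1)
    out_of_range = 0
    for gap in gaps:
        row = max(1, math.ceil(gap))      # assumption: a zero gap lands in row 1
        if row > max_ms:
            out_of_range += 1             # beyond the last histogram row
        else:
            counts[row] += 1

    total = len(gaps)
    rows, running = [], 0
    for upper in range(1, max_ms + 1):
        running += counts[upper]
        rows.append((upper, counts[upper], 100.0 * counts[upper] / total,
                     running, 100.0 * running / total))

    # Summary statistics cover every gap, including OUT OF RANGE points.
    variance = pvariance(gaps)
    stats = {"average": fmean(gaps), "variance": variance,
             "std_dev": math.sqrt(variance), "out_of_range": out_of_range}
    return rows, stats
```

With connect times of 0, 10, 27.5, 45.5 and 120 ms, the gaps of 17.5 and 18 ms both fall in the row labelled 18, and the 74.5 ms gap is counted as OUT OF RANGE while still contributing to the average.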

7.5.22 Data Transfer Size Report (File 42). A sample histogram report on
the frequency of occurrence of sizes of the data blocks transferred between
mass storage and main memory is given in figure 7-22. Refer to subsection
7.5.12 for a description of the histogram format. This report has
increments of 64 words, and the number in the column headed NUMBER WORDS is
the upper value. The occurrence of certain data transfer sizes should be
anticipated. For example, 64-word blocks are used for catalog accessing;
in other parts of GCOS, standard system format is 320 words. SSA modules
are usually slightly less than 512 words. When the Timesharing Subsystem

Figure 7-21. Elapsed Time Between Seeks Report


7.6  Default  Option  Alteration 


Most users rely upon the standard MSM report formats and their default
values, as these suit a wide range of needs. A capability to change the
reports is built into MSMDRP. The general form for all option requests is
as follows: the first card contains an action code describing the action
to be taken. Subsequent cards modify report parameters for some of the
action codes. All input cards are free format, with the only requirement
being that at least one blank space separates multiple input parameters.

The very last input card must have the word "END" entered in it. This card
must be present whether or not any other input options are selected.
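
The card conventions above (an action code card, optional parameter cards, and a closing END card) can be sketched as a small helper. This is an illustrative sketch only; the function name and the tuple-based input format are assumptions and are not part of MSMDRP.

```python
def build_option_deck(requests):
    """Return the list of data cards for a set of option requests.

    `requests` is a list of tuples: the first element is the action code,
    the remaining elements are the parameter cards for that action. A
    parameter given as a list or tuple is written on one card with its
    values separated by a single blank (free format).
    """
    cards = []
    for action, *params in requests:
        if action.upper() == "END":
            raise ValueError("END is appended automatically")
        cards.append(action.upper())
        for param in params:
            if isinstance(param, (list, tuple)):
                cards.append(" ".join(str(value) for value in param))
            else:
                cards.append(str(param))
    cards.append("END")   # always the last card, even with no options
    return cards
```

For example, `build_option_deck([("ON", "RATE"), ("CAT",)])` yields the cards ON, RATE, CAT, END, and an empty request list yields the single required END card.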


There  is  no  specific  order  required  of  the  options,  and  multiple  entries  of 
each  are  permissible.  If  several  inputs  refer  to  the  same  report,  the  last 
one  encountered  will  have  precedence.  If  a  report  is  turned  off  by  default 
and  is  modified,  it  will  be  turned  on  through  the  request  for 
modification.  The  chart  below  shows  the  available  actions:  the  mnemonic 
code  for  the  user  to  identify  the  action;  the  function;  and  the  default. 


Mnemonic  Function (default indicated in parentheses)

AREA      Request file code references made to a specific area of a
          specific device (not provided)
DEBUG     Debug (no debug)
ERROR     Do not stop on input error (stop)
FILDEF    Define system files by name (no names used)
END       This card must be present
MODULE    Produce the SSA Module Usage Report by Job (no report produced)
NCONN     Process a limited number of connects (total tape processed)
NREC      Process a limited number of tape records (total tape processed)
OFF       Turn reports off (all reports on except reports 12, 16, 18, 20;
          see table 7-1)
ON        Turn reports on (all reports on except reports 12, 16, 18, 20;
          see table 7-1)
PROJ      Produce the Connect Summary Report by Userid/SNUMB (no report
          produced)
RN        This option must be selected when MSMDRP is used to process
          WW6.4 data or when MSMDRP is executed on a WW6.4 system
          (process WW7.2/4JS data on a WW7.2/4JS system)
TIME      Set a timespan for measurement (no time criterion)
TIMEQ     Change time quantum for Chronological Device Utilization Report
          (report is off; default value is 60 seconds)
USERID    Suppress userids from reports (userids printed)
RATECH    Change time quantum for Connects Per 10 Minute Report (report
          is off; default value is 10 minutes)
CAT       Turn on the Cat/File String Report (report off)
RATE      Request the Connect Per 10 Minute Report for specific user jobs.
LIMITS    Limit the amount of processing performed and reports produced.
ZERO      Process jobs with an activity number of zero.

7.6.1 Monitor a Specific Device Area (Action Code AREA). This option
allows a user to specify areas of a device for which all jobs
referencing those areas are to be highlighted. The format of the display is
that of a File Code Summary and contains those jobs and file codes that
reference the area of interest.

The device to be investigated is identified via the PUB and IOM number.

The specific areas of interest are identified as beginning at the starting
address defined in llinks. The length of the area is also in llinks, with
a zero meaning the end of the device. Up to ten areas are allowed. The
format for this card is shown in figure 7-25.

See subsections 7.5.12 and 7.5.16 for complete details on the report format
generated with this user option. This report is off by default and will be
activated by the processing of this action code.

The following procedure should be followed when using this option:

o  Determine the device and area on the device to be examined. For
purposes of this example, assume we want to analyze cylinders
300-420 on device 5, IOM-0, PUB-8, and cylinders 200 to the end of
device 7, IOM-0, PUB-8. In addition, we want to report all connects
issued to the specified area of device 5, but report accesses to
the specified area of device 7 only if they exceed 300 connects.

o  Convert the cylinder addresses to LLINKS. For purposes of this
example, assume that we are analyzing 450 disk drives (760 sectors per
cylinder and 5 sectors per llink, as listed in figure 7-25):

cylinder 300 = (300 x 760) = 228000 sectors = 228000/5 = 45600 LLINKS

cylinder 200 = (200 x 760) = 152000 sectors = 152000/5 = 30400 LLINKS

o  Determine the length of the search. For device 5, we want 120
cylinders, and for device 7, we want to go to the end of the device.

o  Convert the cylinder lengths to LLINKS.

120 cylinders = (120 x 760) = 91200 sectors = 91200/5 = 18240 LLINKS

The length for the second search is zero since we want to go to the
end of the pack.

o  Fill out the cards following the format in figure 7-25:

AREA 

2 

085  45600  18240  0 

087  30400  0  300 
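
The cylinder-to-llink arithmetic in the procedure above can be checked with a short Python sketch. The function names are hypothetical, and the way the IOM, PUB and device digits are run together on the card simply mirrors the two example cards; it is an assumption about the format rather than a statement of how MSMDRP parses the card.

```python
SECTORS_PER_LLINK = 5   # from the device definitions in figure 7-25


def cylinders_to_llinks(cylinders, sectors_per_cylinder=760):
    """Convert a cylinder address or length to llinks."""
    sectors = cylinders * sectors_per_cylinder
    if sectors % SECTORS_PER_LLINK:
        raise ValueError("sector count is not a whole number of llinks")
    return sectors // SECTORS_PER_LLINK


def area_card(iom, pub, device, start_cylinder, length_cylinders,
              min_connects, sectors_per_cylinder=760):
    """Build one AREA detail card; a length of 0 means 'to end of device'."""
    start = cylinders_to_llinks(start_cylinder, sectors_per_cylinder)
    length = cylinders_to_llinks(length_cylinders, sectors_per_cylinder)
    # Assumption: single-digit IOM, PUB and device values are concatenated,
    # as in the example cards "085 ..." and "087 ...".
    return f"{iom}{pub}{device} {start} {length} {min_connects}"


deck = ["AREA", "2",
        area_card(0, 8, 5, 300, 120, 0),
        area_card(0, 8, 7, 200, 0, 300)]
```

The resulting `deck` reproduces the four cards shown in the example, which gives a quick way to confirm the hand calculation before punching the cards.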

7.6.2 System Debug (Action Code DEBUG). This is a restricted option for
GMF system developers. DEBUG should only be used with guidance from
CCTC/C751. The option consists of the word DEBUG on the first card and any
one of the following values on the second card:

9999 - perform trace logic checking

9998 - perform GEFSYE debug

9997 - perform accounting debug

9996 - perform SYSOUT debug

1-63 - debug this program number

-1 - debug connects without a valid seek address

-N - linkings of the FC array


7.6.3 Reduction After an Input Option Error (Action Code ERROR). This
option allows data reduction to continue when an error has
been detected and reported in an input option request. The default value
reports the error and aborts the data reduction procedures. The format for
this option is the word ERROR on the data card.

7.6.4 Specify System File Names (Action Code FILDEF). This option allows
the user to specify the name of each system file displayed in the

Card 1 - A
Card 2 - N
Card 3 - BCDEFG

Where

A = The word AREA
N = The number of areas to be specified. A maximum of ten areas is
    permitted. A Card 3 must be present for each area requested.
B = IOM number
C = PUB number
D = Device number
E = Starting address in llinks
F = Length of area in llinks
G = Do not report this file code unless the number of connects exceeds
    this value


The following definitions apply to this option.

Device   Number      Number Sectors/   Number Sectors/
Type     Cylinders   Cylinder          Block (LLINK)

180      200         360               5
181      200         360               5
190      407         589               5
191      407         760               5
450      811         760               5
500      811         760               5
501      840         1280              5

Figure 7-25. Specific Device Area Report Card Input

System File Use Summary Report discussed in subsection 7.5.9. This option
is specified with a set of data cards. The first data card contains the
word FILDEF. The second data card contains the number of system files to
be described on the following cards. The following cards each contain a
single pair of data points separated by at least one blank. The first data
point is the system file number and the second data point is the desired
system file name.

The standard output of the System File Use Summary Report is to label each
system file as System File 1, System File 2, etc., corresponding to
GCOS-HI-USE, GCOS-LO-USE, etc. In order to know the correct order of the
file names, the user should check the $ FILES section of the startup deck.
The order of the files in the $ FILES section of the startup deck is the
order in which they are referenced in the report.

7.6.5 End Card (Action Code END). This card must be present at all times
and must be the last data card supplied. It consists of the word END
entered on the card.


7.6.6 Produce the SSA Module Usage Report by Job (Action Code MODULE).
This option allows the user to produce the SSA Module Usage Report. This
report will list every SSA module used by every job that was run during the
monitoring session. See subsection 7.5.11 for details concerning this
report. This report is off by default and cannot be turned on by using the
ON option. This report can be activated only by entering MODULE on the
data card.

7.6.7 Record Limitation by Connects (Action Code NCONN). This option
allows a user to process only a specific number of connects. This option
is especially useful if the tape contains an error and cannot be
completely processed. Using this option, the user can process the tape or
tapes up to the point where the tape error exists. This option requires
two data cards. The first data card contains the word NCONN and the
second card contains the number of connects to be processed.


7.6.8 Record Limitation by Records (Action Code NREC). This option allows
a user to process only a specific number of tape records. This option is
especially useful if the tape contains an error and cannot be
completely processed. Using this option, the user can process the tape or
tapes up to the point where the tape error exists. This option requires
two data cards. The first data card contains the word NREC and the second
card contains the number of tape records to be processed.

7.6.9 Turn a Report Off (Action Code OFF). This option allows a user to
turn a report off that is on by default. In MSMDRP, all reports are on
except report numbers 12, 16, 18 and 20 (see table 7-1). Only those
reports in table 7-1 that have a name in parentheses () can be turned off
with this option. Two data cards are required to use this option. The
first card contains the word OFF and the second card contains the name of
the report as displayed in the parentheses () in table 7-1.


7.6.10 Turn a Report On (Action Code ON). This option allows a user to
turn a report on that is off by default. In MSMDRP, all reports are on
except report numbers 12, 16, 18 and 20 (see table 7-1). Only those
reports in table 7-1 that have a name in parentheses () can be turned on
with this option (#9, 10, 12, 15, 16, 17, 18, 20). Two data cards are
required to use this option. The first card contains the word ON and the
second card contains the name of the report as displayed in the
parentheses () in table 7-1.

7.6.11 Produce Connect Summary Report by Userid/SNUMB (Action Code PROJ).
This option allows the user to specify up to a total of 40 USERIDs and
SNUMBs for which the Connect Summary Report by Userid/SNUMB is to be
produced. The number of SNUMBs requested cannot exceed 10. In addition,
the user can request the entire File Code Summary Report, the File Code
Summary Report only for a prespecified set of jobs or USERIDs, or no File
Code Summary Report at all. For example, the user can request 35
different USERIDs and 5 SNUMBs, or 40 different USERIDs and 0 SNUMBs, or
30 different USERIDs and 10 SNUMBs, or 3 different USERIDs and 6 SNUMBs,
etc. The format for this option is shown in figure 7-26. If values of
zero are desired, they must be punched on the card. A blank is not
equivalent to a zero. The Connect Summary Report will indicate for each
requested USERID or SNUMB the number of connects made by the job or
USERID. If a requested SNUMB also has a requested USERID, the number of
connects issued by that job will be reported twice in the summary report.
Refer to subsection 7.5.14 for a description of the report to be produced
with this option. If the user desires to see a File Code Summary Report,
it will be turned on via this option. The user does not need to use the
"ON" input option.

7.6.12 Reduce WW6.4 Data or Process MSMDRP on a WW6.4 System (Action
Code RN). This option requires two cards. The first card has the letters
RN and the second card one of the following numbers:

1 - WW6.4/2H system processing WW6.4/2H data

2 - WW6.4/2H system processing WW7.2/4JS data

3 - WW7.2/4JS system processing WW6.4/2H data

The default is a WW7.2/4JS system with WW7.2/4JS data.

7.6.13 Set a Timespan of Measurement (Action Code TIME). The timespan of
data collection can cover many hours of which only a few may be of
interest. This option allows a user to specify the timespan (or spans) to
be displayed in all reports. For example, the user may collect data from
0500 to 2200 but display data only from 0900 to 1700 in all reports.

If the entire reduction will have a set timespan, the name "TOTAL" is
used. Histogram reports cannot be individually timespanned. All timespans
of "other" reports will be bounded by the overall report timespan, if one
is used. Up to five timespans for each report type may be specified.


Card 1  - A
Card 2  - B C D
Card 3+ - E
Card 4* - F

Where

A = The word PROJ
B = 0 if Connect Summary Report is desired, but no File Code Summary
    Report is desired. The 0 must be punched on the card. A blank is
    not equivalent to a 0.
B = 1 if Connect Summary Report is desired and a complete File Code
    Summary Report is wanted
B = 2 if Connect Summary Report is desired and only a partial File Code
    Summary Report is wanted
C = Number of Userids (30 MAX)
D = Number of SNUMBS (10 MAX)
E = A total of C Userids with one Userid per card. If C = 0, this card
    is not present.
F = A total of D SNUMBS separated by at least one blank space.
    All SNUMBS should fit on one card. If D = 0, this card is not
    present.

# - The values B, C and D must be separated by at least one blank column.


Figure 7-26. Limited File Code Summary Input Card Format

7.6.16 Change the Time Quantum Value for the Connect Per 10 Minute Report
(Action Code RATECH). The user can change the time quantum value used to
produce the Connect Per 10 Minute Report by inputting the quantum in
seconds. Two cards are required. The first card contains the word RATECH
and the second card contains the new quantum in minutes. The default value
is 10 minutes.

7.6.17 Turn on the Cat/File String Report (Action Code CAT). This option,
consisting of the word CAT on the data card, will turn on the Cat/File
String Report (see subsection 7.5.13).

7.6.18  Request  the  Connect  Per  10  Minute  Report  for  Specific  User  Job 
(Action  Code  RATE).  This  option  will  allow  the  user  to  obtain  the  Connect 
Report  for  specific  jobs  as  well  as  for  the  Timesharing  Subsystem  and  the 
Total System (see subsection 7.5.24). Card number 1 contains the word
RATE,  card  number  2  the  number  of  jobs  desired  (a  maximum  of  8  are 
permitted),  and  card  number  3  the  SNUMBs  of  the  jobs  desired.  In  addition 
to  the  requested  jobs,  the  Timesharing  Subsystem  as  well  as  the  Total 
System  will  also  be  reported.  If  multiple  copies  of  TSS  are  in  use,  all 
activity  will  be  reported  under  the  single  job  name  of  TS1.  If  the  user 
wants  to  obtain  this  report  for  only  Timesharing  and  the  Total  System,  then 
he  simply  needs  to  use  the  "ON"  input  option  using  the  name  "RATE"  for  the 
required  report  ID. 

7.6.19 Limit the Processing and Output (Action Code LIMITS). This option
allows the user to control the amount of output produced and the amount
of record processing performed. Card number 1 contains the word LIMITS and
card number 2 contains one of the words ONLYSP, NOHIST, SUMARY, UTIL, or
DEVICE.

If the word ONLYSP is used, the Mass Store Monitor program will process
only those data records that are generated by the SNUMBs requested under
the RATE input option (see subsection 7.6.18). All other data will be
ignored. When examining the histograms and reports that are produced, the
user must remember that only a limited amount of data has been processed.

If the word NOHIST is used, no seek or space utilization histograms will
be produced. This option can be used in conjunction with the ONLYSP option
(two LIMITS input cards are then required) or by itself. In the latter
case, all data will be analyzed, but no histograms will be produced.

The user can also request a summary of the seek movement activity, whether
or not the set of histograms is produced. To obtain the summary report,
the word SUMARY must appear on the card immediately following the LIMITS
card. A summary listing will not be produced for the space histograms, as
this summary information is meaningless for this set of histograms.

If the word UTIL is used, only the Device Utilization by Device Number
Report will be produced. This option significantly reduces the run time of
MSMDRP and allows the user to determine those devices with high
utilization. Once these devices have been identified, the user can run a
second job using the DEVICE option.

If the word DEVICE is used, the Data Reduction Program will analyze only
the specific unique device IDs requested on the following cards. If this
option is used, the third card contains the number of unique devices to be
checked (not to exceed 20) and the fourth card contains a list of unique
device ID numbers obtained from the Device Utilization Histogram.

7.6.20 Zero Activity Processing (Action Code ZERO). This option allows
the user to define up to 10 jobs which the user desires to see handled as
normal activities, even though they process with an activity number of
zero. Under normal conditions, any activity processing with an activity
number of zero will be considered a System Scheduler Job (see subsection
7.5.15). To use this option, the first data card should have the word
ZERO. The second data card contains the number of jobs following on the
third card. This number may not exceed 10. The third card contains the
list of SNUMBs, separated by at least one blank column.

7.7  JCL 

The  data  reduction  procedures  consist  of  a  single  FORTRAN  program  having  a 
main  level  and  multiple  subroutines. 

A  description  of  the  more  important  JCL  cards  is  presented  below  (see 
figure  7-28) . 

The $ LIMITS card should be studied to meet user needs. The run time (99)
and output limit (30K) may both need to be altered as required by the
duration of the monitoring run. MSMDRP requires 73K of memory in order
to execute plus an additional 2K for SSA space. During the initial loading
process, MSMDRP will actually require 81K of memory, but 10K will be
released immediately upon loading.

The  statement: 

$  DATA  I* 

is  used  to  identify  the  data  cards  that  follow  as  described  in  subsection 
7.6.  At  least  one  data  card  is  required,  that  being  an  "END"  request. 

7.8  Multireel  Processing 

If more than a single reel of data has been collected, a series of messages
will be output to the console informing the operator that a new data
reel is required. The following messages are produced.

a.  DISMOUNT REEL #XXXXX THEN MOUNT REEL NUMBER YYYYY ON ZZZZZ

In this case, XXXXX is the old reel number, YYYYY is the new reel
number, and ZZZZZ is the tape drive ID.

Col 1  8         16

$      IDENT     1820251/30/3044
$      SELECT    B29IDPX0/OBJECT/MSM
$      TAPE      01,X1D,,12345
$      LIMITS    99,73K,-4K,30K
$      DATA      I*

(The following is a suggested set of data cards)

ON
RATE
CAT
ON
CHRONO
LIMITS
SUMARY
END

$      ENDJOB


Figure 7-28. MSMDRP JCL

If the operator fails to mount the new tape, the above message
will be repeated three times, after which the program will
terminate and all reports will be produced.

b.  IS  TAPE  XXXXX  MOUNTED  ON  DRIVE  ID  YYYYY  (Y/N) 

In this case, XXXXX is the tape number being requested for
mounting  and  YYYYY  is  the  tape  drive  ID. 

This  message  occurs  when  the  data  reduction  program  finds  the 
wrong  tape  has  been  mounted  (by  comparing  internally  generated 
tape  labels).  If  the  operator  answers  N,  the  message  in  (c)  below 
is  produced.  If  the  operator  answers  Y,  the  data  reduction 
program  will  terminate  and  all  reports  will  be  produced.  In  this 
case,  the  data  reduction  program  is  unable  to  process  the  tape. 
Even  though  the  operator  is  mounting  the  correct  tape,  the 
internal  label  on  the  new  tape  does  not  match  that  being  requested 
by  the  old  tape.  The  user  should  check  the  data  collection 
session  to  insure  that  the  operator  did  not  respond  with  an 
incorrect  tape  number  during  multireel  change. 

After  entering  the  Y  or  N,  the  operator  will  need  to  hit  the  EOM 
key  twice  in  order  for  the  response  to  be  transmitted. 

c.  WRONG  REEL  JUST  MOUNTED,  DISMOUNT  AND  MOUNT  REEL  XXXXX  ON  ZZZZZ 

In  this  case,  XXXXX  is  the  new  reel  number,  and  ZZZZZ  is  the  tape 
drive  ID. 

d.  CAN  TAPE  XXXXX  BE  MOUNTED  ON  DRIVE  YYYYY  (Y/N) 

In  this  case,  XXXXX  is  the  new  desired  reel  and  YYYYY  is  the  tape 
drive  ID. 

If the operator fails to answer this message, it will be repeated
until he responds with a "Y" for YES or "N" for NO. If he types
in "Y", then message (a) will be repeated. If he types in "N",
then the program will be terminated and all reports will be
produced.

7.9  Tape  Error  Aborts 

During  the  course  of  processing  it  is  possible  that  the  operator  will  be 
required  to  abort  the  data  reduction  program  due  to  an  irrecoverable  tape 
error.  If  such  a  condition  occurs,  the  operator  should  abort  the  job  with 
a  "U"  abort.  This  will  allow  the  data  reduction  program  to  enter  its 
wrap-up  code  processing  and  produce  all  reports  generated  prior  to  the  tape 
error.

$ XBAR   IOM-1,PUB-14,IOM-0,PUB-14,
         IOM-2,PUB-14,IOM-2,PUB-13,
         IOM-0,PUB-13,IOM-1,PUB-13

MPC Cards

$ MPC-0  PSI-2,IOM-0,PUB-10,PSI-1,IOM-1,PUB-10,
         PSI-3,IOM-2,PUB-10,PSI-3,IOM-2,PUB-11,
         PSI-1,IOM-1,PUB-11,PSI-2,IOM-0,PUB-11

$ MPC-2  PSI-0,IOM-1,PUB-14,PSI-2,IOM-0,PUB-14,
         PSI-1,IOM-2,PUB-14,PSI-1,IOM-2,PUB-13,
         PSI-0,IOM-0,PUB-13,PSI-2,IOM-1,PUB-13


Chart

IOM/Channel   MPC/PSIA     IOM/Channel   MPC/PSIA

0-10          0-2          1-14          2-0
1-10          0-1          0-14          2-2
2-10          0-3*         2-14          2-1
2-11          0-3*         2-13          2-1**
1-11          0-1          0-13          2-0
0-11          0-2          1-13          2-2

*  problem 1
** problem 2

The  problems  described  by  the  above  procedures  could  be  solved  by 
redesigning  the  crossbar  cards  in  the  following  manner: 

$ XBAR   IOM-0,PUB-10,IOM-1,PUB-10,
         IOM-2,PUB-10,IOM-0,PUB-11,
         IOM-1,PUB-11,IOM-2,PUB-11

$ XBAR   IOM-1,PUB-14,IOM-0,PUB-14,
         IOM-2,PUB-14,IOM-1,PUB-13,
         IOM-0,PUB-13,IOM-2,PUB-13

If  the  I/O  request  cannot  be  granted,  because  either  the  channel  or  device 
being  requested  is  currently  busy,  the  request  will  be  queued.  This 
request  will  only  be  serviced  when  both  a  channel  and  device  are  free. 

When  queuing  occurs  for  a  channel,  GCOS  will  indicate  the  request  queued 
over  the  primary  channel.  A  primary  channel  is  that  channel  which  appears 
first on the $ XBAR card, for a given string of devices. Therefore, all
channel  queue  histograms  are  presented  only  for  primary  channels.  However, 
a  queue  on  the  primary  channel  actually  means  that  all  channels,  both 
physical  and  logical,  connected  to  the  desired  device  were  busy.  When  the 
request is finally granted, a trace type 7 is issued.

A table is used to hold the device number, channel number, IOM number, I/O
queue  entry  address,  and  the  time  the  T22  trace  event  occurred.  With  the 
occurrence  of  each  T22  event,  the  table  entry  is  filled  to  mark  the  linking 
of  the  I/O  requests.  At  this  time,  the  required  computations  for 
determining  the  channel  and  device  queue  length  are  made.  Channel  queue 
histograms  are  produced  for  both  tape  and  mass  store  devices,  while  device 
queue  histograms  are  produced  only  for  mass  storage  devices.  The  channel 
and  device  queue  time  also  begins  at  this  point  and  will  be  updated  with 
the  occurrence  of  the  trace  7  event  for  this  I/O  request. 
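
The pairing of a T22 event with its later trace 7 event can be sketched as follows. This is an illustration under stated assumptions, not the data reduction program's code: the event dictionaries, their field names, and the use of the I/O queue entry address as the matching key are hypothetical choices for the example.

```python
def queue_times(trace_events):
    """Pair each trace 22 (request linked) with its trace 7 (connect issued).

    Returns a list of (iom, channel, device, queue_time) tuples, where
    queue_time is the trace 7 time minus the trace 22 time.
    """
    pending = {}    # I/O queue entry address -> saved T22 table entry
    results = []
    for event in trace_events:
        key = event["queue_entry"]
        if event["trace"] == 22:
            # Fill the table entry when the I/O request is linked.
            pending[key] = event
        elif event["trace"] == 7 and key in pending:
            # The I/O has actually been initiated; close out the entry.
            start = pending.pop(key)
            results.append((start["iom"], start["channel"], start["device"],
                            event["time"] - start["time"]))
    return results
```

A request that is linked at time 100 and connected at time 140 contributes a queue time of 40 to the histograms for its channel and device, regardless of what queue length was recorded when the request arrived.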

With  the  eventual  occurrence  of  the  trace  7  event  for  the  I/O  request, 
several  updates  are  required  to  the  common  tables.  The  I/O  queue  time  data 
are  generated  for  the  channel  and  device  and  collected  for  the  appropriate 
histograms.  It  should  be  noted  that  it  is  possible  for  a  device  or  channel 
to  show  no  queuing,  but  yet  they  will  display  I/O  queue  time  in  their  queue 
time  histograms.  The  reason  for  this  is  that  the  queue  time  histogram  is 
reporting  the  time  difference  between  a  T7  trace  and  a  T22  trace.  The  T7 
trace  will  not  be  issued  until  the  actual  I/O  is  initiated  (i.e.,  a  logical 
channel  and  the  device  become  available).  The  following  scenarios  will 
attempt  to  illustrate  this  point: 

Scenario  1 


o  Seven  devices  are  configured  over  two  logical  channels. 

o  Device  1  receives  a  connect.  At  this  point  in  time,  device  1  is 

busy  and  1  logical  channel  is  busy. 

o  Device 2 receives a connect. At this point in time, devices 1 and
2 are both busy and both logical channels are busy.

o  Device  3  receives  an  I/O  request.  Device  3  is  not  currently  busy 
and  therefore  this  I/O  should  be  able  to  be  initiated.  However, 
there  are  no  available  channels  and  therefore  the  I/O  must  be 
delayed.  Note  that  the  queue  length  histogram  for  device  3  will 
report  a  0  queue  length,  but  may  indicate  a  substantial  queue  time. 

o  Device 3 receives 3 additional I/O requests (4 outstanding
requests exist at this point). The Channel Monitor will still
report a queue length of 0 for device 3. The queue at device 3
exists not because of the device, but rather because of a channel
shortage. The queue time histogram for device 3, however, will
show a substantial queue time.

o  Device  1  completes  its  I/O  and  device  3  initiates  one  of  its 
waiting  I/Os. 

o  Device  3  receives  another  I/O  request. 
At this point in time, device 3 is busy and has 3 additional
requests queued for it. Therefore, the CMDRP will now report a
queue length of three. Thus, we see that device queues may
increase in a nonsequential fashion.

Scenario  2 

o A channel queue of length 0 exists when there is at least 1
nonbusy, primary/logical channel.

o When all primary/logical channels become busy, the length of the
channel queue is calculated by summing the length of all device
queues on that channel.

o As an example, assume we had two channels configured with three
devices. If device 1 became busy and 4 more requests were made
for that device, we would have a device queue of 4 and a channel
queue of 0. If during the same time, device 2 received 3
requests, we would generate a queue length of 2 for that device but
the channel queue would still remain at 0. It should be noted,
however, that even though no channel queuing is being reported,
the channel queue time histograms will show significant queue
times. This is because of the delay between the request for the
I/O (trace 22) and the actual connect (trace 7). In the situation
described thus far, device contention, not a shortage of channels,
is the problem. If we now assume that a connect was issued to
device 3, we have now created a channel problem. There is no
available channel for the I/O request. At this point, a channel
queue of 7 (4 requests for device 1 + 2 requests for device 2 + 1
request for device 3) would be generated. Thus, we see that
channel queues may increase in a nonsequential fashion.
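
The channel queue rule of Scenario 2 can be sketched in a few lines of Python. This is only an illustration of the two rules as stated above, not the CMDRP code itself; the function name and the data layout (a list of busy flags and a dictionary of device queue lengths) are assumptions made for the example.

```python
def channel_queue_length(channel_busy, device_queues):
    """Return the channel queue length under the Scenario 2 rules.

    channel_busy  -- list of booleans, one per primary/logical channel
                     (hypothetical layout, chosen for this sketch)
    device_queues -- dict mapping device number to the number of
                     requests waiting for that device
    """
    # Rule 1: at least one nonbusy channel means a channel queue of 0.
    if not all(channel_busy):
        return 0
    # Rule 2: all channels busy, so sum the device queues on the channel.
    return sum(device_queues.values())


# Final state of the example: both channels busy, 4 + 2 + 1 requests waiting.
print(channel_queue_length([True, True], {1: 4, 2: 2, 3: 1}))
# One channel still free: no channel queue is reported.
print(channel_queue_length([True, False], {1: 4}))
```

Because the sum is taken only once every channel is busy, the reported channel queue can jump from 0 directly to a large value, which is the nonsequential growth described above.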

Even  if  a  channel  is  available,  upon  the  occurrence  of  a  T22  trace,  several 
milliseconds  might  pass  before  the  system  generates  the  T7  event.  This  is 
especially  true  on  a  very  busy  system.  A  connect  queue  entry  is  now  filled 
with  data  to  be  used  for  the  I/O  service  time  histograms.  This  connect 
queue  holds  the  IOM,  channel,  device  number,  and  the  time  of  the  trace  7 
event.  The  channel  and  devices  status  table  entries  are  also  marked  busy 
at  this  point.  As  a  confidence  test,  the  channel  status  is  sensed  at  the 
start  of  processing  for  each  trace  7  event  for  a  nonbusy  status.  If  it  is 
busy,  a  lost  interrupt  is  considered  to  have  occurred  since  it  is 
impossible  for  a  connect  to  be  issued  to  a  busy  device  or  channel.  Device 
access  histogram  data  and  an  IOM  command  execution  count  are  also  generated 
at  trace  7  event  time. 

The  next  logical  event  for  the  I/O  process  is  the  termination  interrupt 
originated  by  the  IOM  at  the  I/O  data  transfer  completion.  The  signal  for 
this event is transmitted by the IOM to the processor through the SCU as a
request for the processor to service the I/O completion. The type 4 trace
event contains the IOM number and channel for I/O termination. These data
are used to determine the I/O service time by finding the time difference
of the connect event and the terminate event. The time difference is
collected and displayed in histogram form for each mass store and tape
channel as well as for all mass store devices. The channel and device
queue  length  are  also  adjusted  at  this  point  to  reflect  the  absence  of  a 
queue  being  serviced  for  this  channel  and  device. 
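
The two time differences described in this subsection can be illustrated with a short Python sketch. It assumes one record per I/O request holding the times of its trace 22, trace 7, and trace 4 events; the class and field names are hypothetical and are not taken from the CMDRP source.

```python
from dataclasses import dataclass


@dataclass
class IoTimes:
    """Event times for one I/O request (hypothetical record layout)."""
    t22: int  # time of the trace 22 event (I/O request linked)
    t7: int   # time of the trace 7 event (connect issued)
    t4: int   # time of the trace 4 event (termination interrupt)


def queue_time(io):
    # Queue time: delay between the I/O request and the actual connect.
    return io.t7 - io.t22


def service_time(io):
    # Service time: difference between the connect and terminate events.
    return io.t4 - io.t7


io = IoTimes(t22=100, t7=130, t4=175)
print(queue_time(io), service_time(io))
```

Note that the queue time is nonzero whenever the connect is delayed, even when no queue length is reported for the device or channel.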

It  must  be  noted  that  exceptions  to  the  normal  I/O  process  are  to  be 
expected  and  must  be  accounted  for  in  the  reduction  program.  All  the 
exceptions  encountered  so  far  have  been  diagnosed  and  coding  in  the  program 
will  allow  for  exceptions.  Some  of  the  exceptions  include  the  following: 
o System programs that avoid the T22 trace by generating their own
queue entry and by starting the connect immediately.

o System programs that manipulate the I/O priority by linking
themselves ahead of I/O requests already in the queue.

o I/O requests for device zero (MPC) which make a channel busy but
not a device.

o Lost interrupts from an I/O connect which leave the connect table
and status table in an active state forever, if not detected.

o Special controller commands which do not involve the channel or
device.

8.4  Data  Reduction  Methodology 

The CMDRP currently uses random I/O (File 58) to process histogram data.
This feature allows the CMDRP to process an unlimited number of
channels/devices with a minor increase in memory requirements. As
delivered, the CMDRP will process 75 mass storage devices and 40 mass
storage/tape channels. It will produce 106 unique histograms with a
minimum amount of random I/O. If the number of channels or devices is
insufficient, the user will need to edit file B29IEPXD/SOURCE/CM. The user
should enter the edit subsystem and process the following commands:

B RS:/NRDEVXX=XX75/;*:/NRDEVXX=XX New number of devices/

B RS:/NRCHANXX=XX40/;*:/NRCHANXX=XX New number of channels/

For  each  additional  channel  the  size  of  the  program  will  increase  by  55 
words  and  for  each  additional  device  the  program  will  increase  by  25 
words.  In  the  above  edit  the  character  "X"  signifies  a  single  space. 

The  last  variable  that  will  need  to  be  changed  is  RPTCNT.  This  number 
represents  the  total  number  of  histograms  that  will  be  processed  with 
negligible random I/O. To calculate the total number of histograms that
will  be  produced  under  your  configuration,  the  following  formula  should  be 
used. 

(number of mass store devices)*3 + (number of tape and mass store
channels - both logical and physical) + (number of tape and mass store
physical channels)*2.

If this value is less than 106 no change is required. If the value
computed is greater than 106, the user may alter this value. This will
help to decrease CPU/IO time but will increase storage by 80 words for each
increment above 106. This tradeoff between CPU/IO time and memory must be
made at the discretion of the user. In order to change this value, the
following edit function should be performed:

B RS:/RPTCNTX=106/;*:/RPTCNTX=New value/

As in the earlier edit example, the character "X" should not be typed, but
is  being  used  to  represent  a  blank  column.  After  performing  the  above 
edits,  the  user  should  recompile  the  source  program  by  entering  the  card 
subsystem  and  issuing  a  run  command. 
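
As a worked illustration of the RPTCNT calculation, the formula and the storage tradeoff described above can be written as a small Python sketch. The function names and the sample configuration are hypothetical; only the formula, the delivered value of 106, and the 80 words per increment come from the text.

```python
DELIVERED_RPTCNT = 106        # value of RPTCNT as delivered
WORDS_PER_INCREMENT = 80      # storage cost of each increment above 106


def histogram_count(mass_store_devices, all_channels, physical_channels):
    """Total histograms produced for a configuration.

    all_channels      -- tape and mass store channels, logical and physical
    physical_channels -- tape and mass store physical channels only
    """
    return mass_store_devices * 3 + all_channels + physical_channels * 2


def extra_storage_words(new_rptcnt):
    # Additional words needed if RPTCNT is raised above the delivered value.
    return max(0, new_rptcnt - DELIVERED_RPTCNT) * WORDS_PER_INCREMENT


# Hypothetical site: 30 devices, 30 channels in total, 8 physical channels.
needed = histogram_count(30, 30, 8)
print(needed, extra_storage_words(needed))
```

A result at or below 106 means RPTCNT can be left alone; a larger result gives the value to use in the RPTCNT edit if the user chooses to trade memory for reduced random I/O.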

8.5  CMDRP  Output 

Reports  generated  from  the  Channel  Monitor  may  vary  from  one  collection 
period  to  another  due  to  the  difference  in  configuration  of  the  hardware. 
Report  numbers  are  preassigned  to  the  histogram  reports  which  are  hardware 
independent  and  are  dynamically  assigned  to  histograms  which  denote  the 
channel  and  device  uniquely  configured  on  each  IOM. 

The  following  subsections  will  describe  all  reports  produced  by  the  CMDRP 
and  subsection  8.6  will  describe  the  user  input  options. 

8.5.1  System  Configuration  and  Channel  Usage  Report  (File  57).  This 
report  documents  the  system  identification,  configuration,  and  the  date  and 
time  of  the  monitoring  period,  as  well  as  reporting  the  usage  of  all 
configured  I/O  channels.  Figure  8-2  is  an  example  of  this  report.  The 
heading line indicates the software version number that corresponds to this
document. The version number should be 09-8? CHG-7. The first line after
the  heading  provides  the  tape  number(s)  the  report  was  generated  from,  the 
system  identification,  the  date  (in  the  form  year,  month,  and  day  - 
YYMMDD),  and  the  start  and  stop  times  (HH:MM:SS)  of  the  MONITORING 
SESSION.  The  next  several  lines  of  output  describe  the  overhead  of  all  GMF 
monitors  that  were  active  during  data  collection.  The  monitor  name  is 
given,  its  CPU  time  in  seconds,  and  its  overhead  as  a  function  of  total 
processor  power.  The  GMF  executive  overhead  is  separated  from  the  actual 
monitors  and  is  listed  as  "EXEC".  The  monitor  "NAME"  is  an  area  of  code 
within  the  Mass  Store  Monitor  and  even  though  listed  separately  it  is  also 
included  under  the  monitor  "MSM".  The  monitor  "FMS"  is  also  an  area  of 
code  within  the  Mass  Store  Monitor,  but  in  this  case  it  has  not  been 
included  under  the  monitor  "MSM". 

Monitor  "CM"  in  this  report  describes  the  processor  overhead  of  subroutine 
T4  (terminate  processing)  and  subroutine  T22  (start  I/O  processing). 

Monitor  "MSM"  in  this  report  describes  the  processor  overhead  of  subroutine 
T7  (connect  processing).  Therefore,  if  the  Channel  Monitor  was  active,  but 
the  Mass  Store  Monitor  was  not,  this  report  will  still  list  both  "CM"  and 
"MSM"  as  contributing  to  the  processor  overhead.  The  total  Channel  Monitor 
overhead will be found by adding the overhead of the "CM" monitor to the
overhead of the "MSM" monitor and to the overhead of the "FMS" monitor.
Figure 8-2. System Configuration and Channel Usage Report

[Figure not legible. Recoverable rows (channel, device type, crossbar, connects):]

0-09 .DS450 SEE ABOVE 13131
0-10 .DS450 SEE ABOVE 13129
0-11 .DS450 SEE ABOVE 13135
0-12 .DS191 0-13 189

Figure 8-2. (Part 2 of 2)


If both the Channel Monitor and Mass Store Monitor were active, then the
combined overhead of both monitors can be found as the sum of "MSM" + "CM"
+ "FMS".


For purposes of this report, % overhead is computed as:

                    (CPU Time Used by Monitor)
% overhead = ---------------------------------------------
             (Total Elapsed Time) x (Number of Processors)
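
The overhead computation can be sketched in Python as follows. The function names are hypothetical, and the sketch returns the plain ratio given above; whether the report scales it by 100 before printing is an assumption left to the caller. The second function adds the "CM", "MSM", and "FMS" entries to obtain the total Channel Monitor overhead as described earlier in this subsection.

```python
def overhead_fraction(cpu_seconds, elapsed_seconds, processors):
    """Monitor CPU time divided by (elapsed time x number of processors)."""
    if elapsed_seconds <= 0 or processors <= 0:
        raise ValueError("elapsed time and processor count must be positive")
    return cpu_seconds / (elapsed_seconds * processors)


def channel_monitor_overhead(overheads):
    """Sum the "CM", "MSM", and "FMS" entries of a per-monitor overhead table.

    overheads -- dict mapping monitor name to its overhead (hypothetical layout)
    """
    return sum(overheads.get(name, 0.0) for name in ("CM", "MSM", "FMS"))


# A monitor that used 36 CPU seconds in a one-hour session on 2 processors.
print(overhead_fraction(36, 3600, 2))
# "EXEC" is listed separately and is not part of the Channel Monitor total.
print(channel_monitor_overhead({"CM": 0.5, "MSM": 0.25, "FMS": 0.125, "EXEC": 1.0}))
```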

Following  the  overhead  description  are  three  lines  of  configuration 
information  describing  the  number  of  processors,  IOMs,  and  amount  of  memory 
configured to the system. In addition, the size of GCOS Hard Core, the
size of the Core Allocator, and the size of FILSYS are also presented. The
third line of the configuration data indicates the number of processors
actually configured and actually available. These numbers might be
different from those shown on the first line due to the assigning and
releasing of processors. In figure 8-2, we see that one processor was
released for a period of time (i.e., CPUs actually available is equal to 2.50). The
actual  time  that  processors  were  available  or  released  is  indicated  in  the 
status  message  printouts  (see  subsection  8.5.15). 

The next portion of the report documents the channel configuration by IOM,
listing each configured channel number (both tape and disk), the device
type  configured  to  that  channel,  and  the  channel  crossbarring.  The 
crossbar  column  shows  those  channels  that  are  crossbarred  to  the  channel 
identified  under  the  channel  column.  If  SEE  ABOVE  is  found,  the 
crossbarring  has  been  displayed  on  a  preceding  channel.  The  I-CC  format  of 
each  channel  description  identifies  the  IOM  and  the  channel  number  being 
discussed.  The  last  column  of  this  report  displays  the  number  of  all 
connect  types  issued  over  that  channel.  This  section  will  be  repeated  for 
each  IOM  configured  to  the  system.  This  report  is  always  generated  and 
cannot  be  turned  off. 

8.5.2 System Summary Report (File 57). The System Configuration and
Channel  Usage  Report  and  the  System  Summary  Report  may  be  used  to  assess 
overall  system  utilization.  Figure  8-3  is  an  example  of  the  System  Summary 
Report.  The  first  set  of  lines  shows  the  number  of  connects  to  the 
monitored  mass  storage  subsystems  compared  to  the  total  connects  issued 
(TAPE+DISK)  and  the  connect  rate  per  hour  over  the  subsystem.  Most  systems 
will  show  a  small  number  of  Control  Connects  being  generated  by  the  MPCs 
configured  to  the  system.  These  Control  Connects  will  be  summed  together 
and  listed  as  a  separate  subsystem  line.  Analysis  on  a  Shared  Mass  Storage 
System  shows  the  number  of  MPC  connects  generated  to  be  a  significant 
percentage  of  the  total  connects  generated.  The  next  lines  show  the 
breakdown  of  the  mass  storage  and  tape  connects  by  the  IOM  over  which  they 
were  issued.  The  final  part  of  this  report  is  a  list  of  the  commands 
(octal  code  and  mnemonic)  issued  to  the  mass  storage  subsystem  and  the 
count of each issued during the monitoring session. This report is always
generated and cannot be turned off.


A well performing system, under a heavy workload, should show a high
utilization of the configured resources. Figure 8-3 shows that the I/O
activity is evenly divided between the MS0450 and MS0500 subsystems. All
signs indicate that if system throughput is being affected by disk
activity, then the MS0450s and MS0500s would be the probable cause of such
problems.

The next item to check on these two reports should be the channel usage.
The two highest used logical channels of any subsystem should be on a
separate PSIA channel of a two-PSIA channel subsystem (see subsection
8.3). Referring to figure 8-2, one can see that logical channel 8 of IOM 0
and IOM 1 has the highest usage, and this is the proper configuration
(refer to detailed description in subsection 8.3). If the highest used
logical channels are not on separate PSIA channels, the $ XBAR card in the
startup configuration section is suspected as the cause. The channels are
used in the order given on the $ XBAR card (i.e., if the primary channel is
busy, the next channel tried is given on the crossbar). The alternate use
of PSIA channels for maximum simultaneity must, therefore, be appropriately
specified in the boot deck.

While looking at the System Summary Report, it is also of interest to note
the ratio of READ commands to WRITE commands (nearly four to one in this
example). This gives an indication of the nature of the usage of the mass
storage space. A quick look at the number of write/verify (WR-VER)
commands executed is also of interest as they are essentially double
(WRITE, then READ) data transfer commands which require more device and
channel time.

The  general  fraction  of  utilization  for  each  logical  channel  gives  an 
indication  of  the  degree  of  simultaneity  of  access  to  the  subsystem.  If 
only  N  of  the  configured  logical  channels  have  nonzero  counts,  then  there 
were  never  more  than  N  accesses  being  performed  simultaneously  by  the 
subsystem.  The  proportional  relationships  among  the  counts  of  accesses 
made  over  each  of  the  logical  channels  are  quantitative  indications  of  the 
frequency  of  occurrence  of  specific  levels  of  simultaneity.  As  an  example, 
if  we  look  at  figure  8-2,  we  see  that  all  of  the  logical  channels  for  a 
given  disk  subsystem  are  receiving  nearly  an  identical  number  of  connects. 
This  would  indicate  that  a  significant  number  of  times  all  of  the  logical 
channels  were  busy  and  the  degree  of  simultaneity  was  very  high.  In  this 
example,  channel  queuing  (i.e.,  shortage  of  channel  power)  could  very  well 
be a problem. This is not to imply that device queuing is not also a
problem, just that channel queuing may be a problem. If the number of
accesses  to  the  lowest  priority  channel  is  a  smaller  percentage  of  the 
total  accesses,  then  channel  queuing  will  probably  not  need  to  be  examined. 
8.5.3 System Traces Captured by Monitor Report (File 57). This report
contains  the  number  of  occurrences  of  each  specific  trace  type  recorded  on 
the  data  collector  tape  processed  by  the  CMDRP  (figure  8-4).  This  report 
provides  little,  if  any,  information  required  by  the  user  for  his 
analysis.  This  report  is  always  generated  and  cannot  be  turned  off. 

8.5.4 Channel Status Changes Report (File 57). This report lists the
initial status for all tape and disk channels configured to the system
(figure 8-5). If, during the course of the monitoring session, a given
channel or IOM was dropped or added to the system (dynamic reconfiguration),
a new report will be produced indicating the activation or deactivation
changes  and  the  time  that  the  change  occurred.  Finally,  this  report  will 
indicate  whether  the  SSA  cache  option  and  JWS  cache  option  are  active,  and 
if  so,  will  indicate  their  initial  status  and  any  changes  that  occur  to 
that  status.  If  a  given  option  is  not  active,  a  zero  will  be  reported  for 
each  of  the  values.  This  report  is  always  generated  and  cannot  be  turned 
off. 


8.5.5 Physical Device, Device ID Correlation Table (File 57). Each mass
storage device configured in the system is listed with a unique device ID.
A typical report is presented in figure 8-6. This unique device ID is needed
since different devices can have the same device number on the Honeywell
6000. (See Device ID 24, Device ID 33, and Device ID 49 of figure 8-6.)
These  unique  numbers  are  referenced  in  several  reports  produced  by  the 
CMDRP.  This  report  is  always  generated  and  cannot  be  turned  off. 

8.5.6 Channel Statistics Report (File 57). The Channel Statistics Report
is  actually  a  series  of  reports  used  to  summarize  the  queuing  that  occurred 
over  the  channels  and  devices.  These  reports  are  processed  as  a  group  and 
are  produced  by  default.  The  entire  series  of  reports  can  be  turned  off 
via  the  use  of  the  "OFF"  input  option  (see  subsection  8.6.5). 

8.5.6.1 Channel Busy and Device Busy Report. The Channel Busy and Device
Busy Report is given in figure 8-7. These data are collected into an array
during execution of the reduction program and indicate the number of times
channel K and device N were both busy (connects had been issued and were
currently in processing) at the I/O request link time. Remember that this
report is presented by primary channel. The primary channel is busy when
all logical channels associated with the primary channel are busy. As an
example, figure 8-7 would indicate a potential channel shortage for channel
16 IOM 0 and a potential device contention problem for devices 9, 11 and 13.

8.5.6.2 Channel Busy and Device Free Report. The Channel Busy and Device
Free Report, figure 8-8, is generated in a similar method to the Channel
Busy and Device Busy Report. If a channel is busy a sufficient number of
times, this will indicate the need for more channel power. If, for
example, there are no more IOM ports available, a large amount of channel
queuing can be a strong indication of the need for another IOM. An entry
is made to this report if a connect to a given device is delayed because
there are no available logical channels and there is no active connect
being processed for the given device. Figure 8-8 shows severe channel
contention for channel 16 IOM 0.

Figure 8-6. Physical Device, Device ID Correlation Table

CHANNEL BUSY AND DEVICE BUSY REPORT FOR AZ-DP3 ON 83-08-04

CHANNEL BUSY AND DEVICE FREE REPORT FOR AZ-DP3 ON 83-08-04

Figure 8-8. Channel Busy and Device Free Report

8.5.6.3 Channel Free and Device Busy Report. The Channel Free and Device
Busy Report, figure 8-9, is generated in a similar manner to the previous
two reports. It is an indication of the number of times that an I/O
request was queued because the device was busy but there were available
channels. Significant values in this report would be an indication that
possible relocation of files is required. The Mass Store Monitor can be
used with this data tape to determine the files being accessed on this
device. Figure 8-9 shows several devices on channel 8 IOM 0 with
significant contention.

CHANNEL FREE AND DEVICE BUSY REPORT FOR AZ-DP3 ON 83-08-04

8.5.6.4 Channel Free and Device Free Report. The Channel Free and Device
Free  Report,  figure  8-10,  is  an  indication  of  the  number  of  I/O  requests 
that  were  executed  by  IOS  immediately,  without  any  queuing.  This  report  is 
used  as  a  backup  to  the  previous  reports.  As  can  be  seen  in  figure  8-10, 
several  devices  on  channel  8  IOM  0  (14,  16,  17,  18,  19)  have  limited 
contention  in  that  between  15-35  percent  of  all  connects  are  delayed.  This 
should  be  confirmed  by  the  other  reports  (see  figure  8-9).  In  addition, 
this  figure  reconfirms  the  previous  discovery  that  channel  16  IOM  0  has 
serious  contention  problems  in  that  only  a  small  percentage  of  connects  are 
issued  without  any  delay. 
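
The four reports of subsections 8.5.6.1 through 8.5.6.4 amount to a four-way tally taken at I/O request link time. The following Python sketch shows one way such a tally could be kept. It is an illustration only: the function names and the tuple layout of the input are assumptions, and the sketch uses the rule stated above that a primary channel is busy only when all of its logical channels are busy.

```python
from collections import Counter


def primary_channel_busy(logical_channel_busy):
    # A primary channel is busy when all of its logical channels are busy.
    return all(logical_channel_busy)


def classify_request(channel_busy, device_busy):
    """Name the report that an I/O request contributes to at link time."""
    channel = "BUSY" if channel_busy else "FREE"
    device = "BUSY" if device_busy else "FREE"
    return f"CHANNEL {channel} AND DEVICE {device}"


def tally(requests):
    """Count requests per report.

    requests -- iterable of (logical_channel_busy_flags, device_busy) pairs,
                a hypothetical layout chosen for this sketch
    """
    counts = Counter()
    for flags, device_busy in requests:
        counts[classify_request(primary_channel_busy(flags), device_busy)] += 1
    return counts


sample = [
    ([True, True], False),   # delayed only by a channel shortage
    ([True, False], True),   # delayed only by the device
    ([False, False], False), # issued immediately
]
for report, count in sorted(tally(sample).items()):
    print(report, count)
```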

8.5.6.5 GEPR Connect Report. The GEPR Connect Report, figure 8-11, shows
the  number  of  times  that  a  trace  type  7  was  processed,  without  a  preceding 
trace  type  22.  This  unexpected  sequence  of  traces  is  supposed  to  occur 
whenever  the  system  processes  a  GEPR  (I/O  error)  event.  During  data 
collection,  the  CM  captures  an  I/O  status  word  indicator  and  the  data 
reduction  program  checks  this  status  word  to  verify  that  a  GEPR  has 
actually  occurred.  This  report  will  indicate  how  many  confirmed  GEPRs 
(i.e.,  the  status  word  was  set)  have  occurred  and  how  many  suspected  GEPRs 
have  occurred  (i.e.,  the  status  word  was  not  set).  This  report  will  be  of 
little  aid  to  the  analyst  and  CCTC  is  still  investigating  the  reason  for 
this  trace  occurrence,  when  the  status  word  is  not  set. 

8.5.6.6 Lost Interrupt Report. The Lost Interrupt Report, figure 8-12,
indicates  that  a  trace  type  7  is  being  processed  for  a  busy  device  and/or 
channel.  This  is  an  impossible  event  and  is  an  indication  that  a  trace 
type  4  has  not  been  generated.  This  is  usually  an  indication  that  the 
system  has  generated  a  lost  interrupt.  However,  if  lost  data  has  been 
generated  during  data  collection,  this  report  may  indicate  many  lost 
interrupts,  which  really  did  not  occur. 

8.5.6.7 Device ID STIOS Not Connected Report. The Device ID STIOS Not
Connected  Report,  figure  8-13,  shows  the  number  of  start  I/Os  that  were 
dropped  for  each  device.  A  start  I/O  is  dropped  when  it  is  found  that  a 
STIO  trace  has  occurred  for  a  device  and  within  a  user-defined  timeframe 
(5-second  default),  no  connect  has  been  received  for  the  start.  The  cause 
of  this  condition  currently  is  under  analysis.  This  report  is  also 
presented  by  the  device  ID  and  must  be  correlated  by  using  the  Device  ID 
Correlation  Report. 
8.5.6.8 Entries Still in Queue Report. The Entries Still in Queue Report,
figure  8-14,  shows  all  entries  remaining  in  the  CMDRP  queues.  This  report 
shows  the  device  number,  channel  number,  IOM  number,  queue  location,  and 
time  (in  milliseconds)  the  entry  has  been  in  the  queue.  These  entries  were 
active  when  the  monitor  terminated. 

8.5.6.9 Device Free But Has a Queue Report. The Device Free But Has a
Queue Report, figure 8-14.1, shows the number of times a connect was made
to a device, the device was free (no active connect being processed), but
yet there were outstanding requests waiting for the device. In most
circumstances, this event would be an example of a device being delayed
because of the lack of sufficient channel power (see subsection 8.3). As
can be seen in figure 8-14.1, device 11 on channel 16 IOM 0 had this
occurrence for 19.9 percent (3728 times) of all its connects. Of the 3728
times this occurred, 3444 times there were no available channels.

However, in figure 8-14.1, several devices report this occurrence, but yet
indicate that there were free channels. Under normal circumstances, if
there is excess channel capacity and outstanding I/O for a device, then
that device should begin to process the outstanding I/O. The occurrence of
this condition is an indication of a backlog of I/O requests that the GCOS
system is unable to keep pace with. The channel or device has probably
just been freed and GCOS has not yet been able to process the next
outstanding request for that device.

8.5.6.10 500 Disk Drive Report. The 500-type disk drives are actually 2
logical devices which are configured as a single physical drive.
Therefore, it is possible for a connect to be made to a 500 disk, the disk
to be free, but yet the connect cannot be issued because the partner 500
disk (the second logical device) is busy. The 500 drives are configured as
odd/even pairs. Figure 8-14.2 describes the number of times this event
occurs. As can be seen in this figure, device 11 on channel 16 issued
16,762 connects (89.6 percent of all connects issued to that device) that
were delayed because its partner device (device 12) was busy. The average
accumulated queue size for both devices when this event occurred was 1.3.
In a similar manner, 14,715 connects (67.4 percent of all connects issued
to device 12) were delayed because its partner device (device 11) was
busy. This is an excellent example of device contention between two
different disk packs that are physically configured on a single drive.
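
The partner relationship can be sketched in Python as shown below. The pairing rule used here, an odd device number n paired with device n+1, is an assumption inferred from the device 11 and device 12 example above; the text states only that the drives are configured as odd/even pairs. The function names are hypothetical.

```python
def partner_device(device):
    """Return the assumed partner of a 500-type logical device.

    Assumes an odd device n is paired with n + 1 (e.g., 11 with 12).
    """
    if device <= 0:
        raise ValueError("device number must be positive")
    return device + 1 if device % 2 == 1 else device - 1


def delayed_by_partner(device, busy_devices):
    # The device itself is free, but its partner logical device is busy.
    return device not in busy_devices and partner_device(device) in busy_devices


print(partner_device(11), partner_device(12))
print(delayed_by_partner(11, {12}))
```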

Figure 8-14.2. 500 Disk Drive Report

8.5.7 Idle Monitor Report (File 57). If the Idle Monitor was active when
the Channel Monitor was running, an Idle Report will be produced next (see
figure 8-15). The first few lines will indicate, for each processor, the
number of times that processor went idle and the percent idleness of
that processor. This is followed by an average idle percentage for the
entire system. The next line of output indicates for what percentage of
system idle time there was active disk I/O; i.e., I/O in progress or I/O
request queued. This figure would give a good indication as to whether the
CPU is going idle because of a lack of work or rather because work is being
delayed by the slower peripheral devices. In figure 8-15, we see that for
100 percent of the idle time, the CPU was actually waiting on disk I/O to
be performed. In this case, the CPU could be put to better utilization, if
only the speed of the disk subsystem could be increased. The next line of
output indicates how many outstanding I/Os were present when the system
went idle.

IDLE REPORT FOR SYSTEM 0SCC2 ON 79-07-13

Following  this  output,  a  table  is  generated  for  every  disk  device  that  had 
any  active  I/O  on  it  when  the  CPU  went  idle.  For  each  such  device,  the 
percent  of  Idle  Time  during  which  it  had  active  I/O  and  the  average  queue 
size  at  that  time  is  given.  In  figure  8-15,  we  see  that  during  95  percent 
of  CPU  idle  time  device  ID  #1  had  outstanding  I/O  to  an  average  length  of 
4.66.  Device  #4  had  active  I/O  for  46  percent  of  the  CPU  idle  time  with  an 
average length of 1.2. By examining this report, we can see that if the
queues on these two devices could be reduced, there is sufficient CPU power
available to handle the additional workload that would be generated. The
Device  ID  Correlation  Report  should  be  used  to  convert  a  unique  device  ID 
into  an  IOM,  PUB,  device  number.  This  report  was  generated  from  a 
different  monitoring  session  than  all  previous  reports  described  thus  far. 

8.5.8 Proportionate Device Utilization Report (File 57). This report
shows the proportionate utilization of each device configured on the mass
storage subsystem. Figure 8-16 is an example. This histogram identifies
each unique device ID (device number zero is an MPC controller) and
provides  both  a  count  of  the  number  of  accesses  made  to  each  device  (under 
the  column  headed  INDIV.  NUMBER)  as  well  as  the  percent  of  all  accesses 
which  were  to  each  device  (under  the  column  headed  INDIV.  PRC).  The 
histogram  shows  the  proportionate  utilization  of  each  device  (i.e.,  the 
percent  of  all  accesses  which  went  to  each  device)  in  a  graphical  form. 
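
The two columns of this report can be reproduced from raw per-device access counts. The following Python sketch is illustrative only; the function names and the sample counts are hypothetical and are not taken from the data reduction program or from figure 8-16.

```python
def proportionate_utilization(access_counts):
    """Map device ID -> (INDIV. NUMBER, INDIV. PRC) from raw access counts."""
    total = sum(access_counts.values())
    if total == 0:
        return {dev: (0, 0.0) for dev in access_counts}
    return {dev: (n, 100.0 * n / total) for dev, n in access_counts.items()}


def render_histogram(report, width=50):
    """Return one printable line per device, with a bar scaled to the percent."""
    lines = []
    for dev in sorted(report):
        count, pct = report[dev]
        bar = "X" * round(pct * width / 100.0)
        lines.append(f"{dev:>3} {count:>8} {pct:6.2f} {bar}")
    return lines


# Hypothetical access counts keyed by device ID (assumed values).
sample = {1: 600, 2: 200, 3: 150, 4: 50}
report = proportionate_utilization(sample)
for line in render_histogram(report):
    print(line)
```

A device whose bar is much longer than the others is the kind of candidate bottleneck the report is intended to expose.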

The  physical  device  that  each  "Device  ID"  of  the  histogram  represents  is 
shown  in  the  Physical  Device  ID  Correlation  Table  (see  figure  8-6).  This 
report  is  always  generated  and  cannot  be  turned  off.  In  this  report  the 
user  is  looking  for  a  device  or  devices  which  have  significantly  more 
utilization  than  others  in  the  system.  This  highly  used  device  would  then 
be  a  potential  bottleneck. 

It  is  desirable,  but  not  always  practical,  to  have  equal  utilization  for 
each  device.  The  user  should  be  reminded  that  data  in  figure  8-16  is 
cumulative  over  the  monitoring  period.  The  actual  accessing  pattern  could 
have  been  periodic  with  the  following  form:  Many  accesses  to  device  4 
followed  by  many  accesses  to  device  3  followed  by  many  accesses  to  device  2 
followed  by  many  accesses  to  device  1,  etc.  Each  device  could  have  been  a 
bottleneck  for  a  subperiod  of  the  total  monitoring  period.  This  could  also 
have been the case if the proportionate utilization of each device was
equal. The Channel Monitor can be used to uncover this cyclic type of
usage. In addition, the Chronological Device Utilization Report (see
subsection 7.5.18) was designed to uncover this type of problem by breaking
down device utilization over time, rather than by utilizing a histogram.
Nevertheless, when a single or small number of devices has a
disproportionately large share of the accesses, they are potential
bottlenecks and their usage should be further analyzed.

Figure 8-16.1. Device Utilization Report
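
The effect described above, in which equal cumulative utilization hides a periodic bottleneck, can be illustrated with a short Python sketch. The trace format (interval index, device ID) and all names are assumptions made for this example; they do not describe the GMC tape format.

```python
from collections import Counter


def cumulative_share(accesses):
    """Percent of all accesses per device over the whole monitoring period."""
    counts = Counter(dev for _, dev in accesses)
    total = sum(counts.values())
    return {dev: 100.0 * n / total for dev, n in counts.items()}


def share_by_interval(accesses):
    """Percent of accesses per device within each time interval."""
    per_interval = {}
    for interval, dev in accesses:
        per_interval.setdefault(interval, Counter())[dev] += 1
    return {
        interval: {dev: 100.0 * n / sum(c.values()) for dev, n in c.items()}
        for interval, c in per_interval.items()
    }


# Hypothetical trace: four intervals, each dominated by a single device
# (device 4, then 3, then 2, then 1), 100 accesses per interval.
trace = [(i, dev) for i, dev in enumerate([4, 3, 2, 1]) for _ in range(100)]

overall = cumulative_share(trace)
by_interval = share_by_interval(trace)
print(overall)      # every device shows an equal cumulative share
print(by_interval)  # each interval is dominated by one device
```

In this assumed trace the cumulative view shows four equally used devices, while the per-interval view shows that each device carried the entire load for one subperiod.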

This report will show all connects that were issued to a given device.
This includes all read/write connects, as well as any command type connects
issued to a given device (see subsection 7.3).

This  histogram  can  report  a  maximum  of  50  devices.  If  a  site  is  configured 
with  more  than  50  devices,  a  second  report  will  be  produced,  as  a 
continuation  of  Report  1  (see  figure  8-16.1). 

8.5.9 Queue Length and Queue Time Histograms (File 57). Figure 8-17 shows
the  I/O  queue  length  and  queue  time  histograms  for  the  I/O  requests  to 
devices  as  they  are  processed  by  IOS.  These  reports  occur  in  pairs,  one 
pair  for  each  device.  These  reports  are  generated  only  for  mass  store 
devices.  The  first  report  in  the  pair  is  a  length  report  and  the  second  is 
a  time  report.  The  histogram  is  read  in  the  same  manner  as  the 
Proportionate  Device  Utilization  Report.  In  addition  to  individual 
percentages, cumulative percentages are also reported. In figure 8-17
(part 2) we see that 90.27% of all requests have a queue time of 12 ms or
less while 91.67% of all requests have a queue time of 22 ms or less. The
number  of  entries  in  these  two  reports  might  not  be  equal.  The  first 
report  is  generated  at  the  time  of  the  request.  When  an  I/O  request  is 
made,  an  entry  is  made  to  the  histogram  indicating  the  number  of  requests 
outstanding  or  in  progress  to  this  device.  The  second  report  is  generated 
at  the  time  of  the  actual  connect  and  indicates  the  length  of  time  between 
the  request  and  the  actual  connect.  Since  observations  have  indicated  that 
some  STIOS  are  never  connected,  the  first  report  may  have  a  higher  number 
of  entries  than  the  second.  These  two  reports  can  be  correlated  by 
subtracting  the  number  of  STIOS  not  connected  and  the  number  of  entries 
still  in  queue,  for  a  given  device,  from  the  number  of  entries  in  the  queue 
length histogram. Figure 8-17 (part 1) shows that 736 times there was a
queue  on  device  9.  This  figure  should  correlate  with  the  number  of  times 
device  9  was  reported  busy  on  the  Channel  Busy-Device  Busy  Report  and  the 
Channel  Free-Device  Busy  Report.  The  two  previous  reports  show  the 
relationship  between  device  and  channel  contention,  while  this  histogram 
can  be  used  for  measuring  the  depth  of  queuing  and  the  distribution  of 
queue  lengths. 
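
As an illustration of how the individual and cumulative percentages are read, and of how the entry counts of the two reports can be reconciled, consider the following Python sketch. The bucket boundaries, counts, and function names are hypothetical; only the arithmetic follows the description above.

```python
def with_cumulative(buckets):
    """buckets: list of (upper_bound_ms, count) in ascending order.

    Returns rows of (upper_bound_ms, count, individual_pct, cumulative_pct).
    """
    total = sum(count for _, count in buckets)
    rows, running = [], 0
    for bound, count in buckets:
        running += count
        indiv = 100.0 * count / total if total else 0.0
        cumul = 100.0 * running / total if total else 0.0
        rows.append((bound, count, indiv, cumul))
    return rows


def expected_time_entries(length_entries, not_connected, still_queued):
    """Entries expected in the queue time histogram for one device.

    Start from the queue length histogram count, then subtract requests
    that were never connected and requests still in queue.
    """
    return length_entries - not_connected - still_queued


# Hypothetical queue time histogram for one device (assumed values).
rows = with_cumulative([(2, 500), (4, 300), (6, 150), (8, 50)])
for bound, count, indiv, cumul in rows:
    print(f"<= {bound:2d} ms {count:5d} {indiv:6.2f} {cumul:6.2f}")

print(expected_time_entries(1000, 12, 3))
```

The cumulative column answers questions of the form "what percent of requests waited N ms or less", which is how the figure 8-17 values quoted above are read.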

Figure  8-18  shows  the  channel  queue  length  and  queue  time  for  the  I/O  queue 
entries  as  they  are  used  in  the  system  channel  environment.  These 
histograms  will  be  produced  for  both  mass  store  and  tape  channels.  Once 
again, the number of entries in these figures will not correlate because of
STIOS not connected that were issued over a given channel, and not over a
single device.

Figure 8-17. (Part 2 of 2)

The user should refer to subsection 8.3 for a complete description of the
methodology  involved  in  making  entries  to  these  reports. 

Figure 8-18 (part 1) shows that 41339 times there was a queue on channel
16.  This  figure  should  correlate  with  the  number  of  times  channel  16  was 
reported  busy  on  the  Channel  Busy-Device  Busy  Report  and  the  Channel 
Busy-Device  Free  Report.  The  two  previous  reports  show  the  relationship 
between  device  and  channel  contention,  while  this  histogram  can  be  used  for 
measuring  the  depth  of  queuing  and  the  distribution  of  queue  lengths. 

8.5.10 Service Time Histograms (File 57). In all of the device
histograms, it should be noted that the name of the device is also given.
In figure 8-17, queuing statistics were presented for a device with the
name  DPC.  If  an  exchange  took  place  and  the  DPC  disk  pack  was  moved  from 
0-08-09  to  0-08-12,  the  data  reduction  program  will  account  for  that 
exchange  and  any  connects  that  are  made  to  0-08-12  will  be  reported  on  this 
histogram  and  not  to  the  0-08-09  histogram. 

In  the  upper  right-hand  corner  of  the  report,  a  report  number  is 
indicated.  This  report  number  is  used  only  to  distinguish  one  histogram 
from  another  and  in  no  way  indicates  the  device  to  which  the  report 
refers.  In  addition,  report  numbers  may  not  appear  sequentially  and  this 
in  no  way  is  indicative  of  a  problem.  All  histogram  reports  are  generated 
by  default  and  may  be  turned  off  by  using  the  LIMITS  option  (subsection 
8.6.13). 

Figure  8-19  shows  the  I/O  service  time  histograms  for  each  I/O  channel  and 
device.  Each  histogram  is  given  in  2ms  intervals.  The  I/O  service  time  is 
defined  as  the  time  (in  ms)  from  connect  to  the  time  that  IOS  processes  the 
terminate  interrupt  for  the  I/O  request.  These  histograms  are  generated 
for  both  mass  store  and  tape  channels,  as  well  as  all  mass  store  devices. 

On  the  bottom  line  of  this  report,  an  indication  is  given  as  to  the  percent 
of  total  time  that  this  device  or  channel  was  busy. 
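
The 2 ms bucketing and the busy percentage can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the service times are hypothetical, the names are invented for the example, and the busy percentage is approximated as the sum of service times divided by the monitored time, which assumes that service periods on the device or channel do not overlap.

```python
def service_time_histogram(service_times_ms, interval_ms=2):
    """Count service times in fixed-width buckets.

    Bucket k covers the range [k * interval_ms, (k + 1) * interval_ms).
    """
    hist = {}
    for t in service_times_ms:
        bucket = int(t // interval_ms)
        hist[bucket] = hist.get(bucket, 0) + 1
    return dict(sorted(hist.items()))


def percent_busy(service_times_ms, monitored_ms):
    """Approximate percent of monitored time spent servicing I/O.

    Assumption: service periods do not overlap, so they can be summed.
    """
    return 100.0 * sum(service_times_ms) / monitored_ms


# Hypothetical connect-to-terminate times in ms (assumed values).
times = [1, 2, 3, 4, 25]
hist = service_time_histogram(times)
busy = percent_busy(times, monitored_ms=100)
print(hist)
print(busy)
```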

The  three  device-oriented  histograms,  just  described,  have  entries  placed 
in  them  for  every  connect  issued  to  the  device  (not  just  multicommands  such 
as  seek-read  or  seek-write). 

Figure 8-19. I/O Service Time Report (Part 1 of 2)

$  IDENT  1820251/30/3044
$  SELECT  B29IDPX0/OBJECT/CM
$  TAPE  01,X1D,,12345
$  LIMITS  99,50K,-4K,30K
$  DATA  I*

(The following is a suggested set of data cards)

JOB
2
TS1 FTS
LIMITS
SUMARY
END

Figure 8-25. CMDRP JCL

tape labels). If the operator answers N, the message in (c)
is produced. If the operator answers Y, the data reduction
program will terminate and all reports will be produced. In this
case, the data reduction program is unable to process the tape.
Even though the operator is mounting the correct tape, the
internal label on the new tape does not match that being requested
by the old tape. The user should check the data collection
session to ensure that the operator did not respond with an
incorrect tape number during multireel change.

After  entering  the  Y  or  N,  the  operator  will  need  to  hit  the  EOM 
key  twice  in  order  for  the  response  to  be  transmitted. 

c. WRONG REEL JUST MOUNTED, DISMOUNT AND MOUNT REEL XXXXX ON ZZZZZ

In  this  case,  XXXXX  is  the  new  reel  number,  and  ZZZZZ  is  the  tape 
drive  ID. 

d.  CAN  TAPE  XXXXX  BE  MOUNTED  ON  DRIVE  YYYYY  (Y/N) 

In  this  case,  XXXXX  is  the  new  desired  reel  and  YYYYY  is  the  tape 
drive  ID. 

If the operator fails to answer this message it will be repeated
until he responds with a "Y" for YES or "N" for NO. If he types
in "Y", then message (a) will be repeated. If he types in "N",
then the program will be terminated and all reports will be
produced.

8.9  Tape  Error  Aborts 

During  the  course  of  processing  it  is  possible  that  the  operator  will  be 
required  to  abort  the  data  reduction  program  due  to  an  irrecoverable  tape 
error. If such a condition occurs, the operator should abort the job with
a  "U"  abort.  This  will  allow  the  data  reduction  program  to  enter  its 
wrap-up  code  processing  and  produce  all  reports  generated  prior  to  the  tape 
error. 

CPU AND TAPE REDUCTION, VERSION 7.2 - 11.74, 24 SEP 1982

SECTION 14. CONDUCTING A SITE COMPUTER PERFORMANCE EVALUATION USING
THE  GMC 


14.1  Introduction 

A  general  plan  for  using  the  GMC  System  and  document  when  conducting  a 
performance  evaluation  of  a  Honeywell  6000  computer  system  is  presented  in 
this  section.  Detailed  analysis  procedures  that  guide  the  analyst  in 
applying  specific  techniques  to  analyze  system  performance  are  introduced. 
The  primary  purpose  is  to  aid  the  analyst  in  his  use  of  the  GMC  reports. 

This  section  will  indicate  which  reports  the  analyst  should  reference,  what 
data  he  should  extract  from  those  reports,  and  some  guidance  as  to  what  the 
reports  are  indicating.  In  some  cases,  possible  corrective  ("tuning") 
recommendations  will  be  made,  but  tuning  guidance  is  not  the  primary  purpose 
behind  this  section.  If  a  tuning  study  should  be  made,  the  user  must  rely 
on  his  own  personnel  skills  and  the  tools  of  GMF  to  perform  such  a  study. 

14.2  General  Definitions 

14.2.1  Computer  Performance  Evaluation  (CPE).  Computer  performance 
evaluation  is  a  generic  term  applied  to  many  techniques  for  determining  the 
performance  characteristics  of  both  a  computer  system  and  its  associated 
site processing activities. The performance characteristics may be
compared to many criteria, including: (1) standards of economic operation,
(2)  technical  norms,  or  (3)  measures  of  service  provided. 

14.2.2  Computer  System  Performance  Variables.  The  performance  of  a 
computer  system  is  influenced  by  nearly  every  facet  of  the  data  processing 
function.  The  following  examples  define  the  scope  of  the  computer 
performance  function. 

14.2.2.1  System  Design.  Computer  application  system  design  and  development 
can  be  the  starting  point  of  performance  degradation.  Errors  in  original 
design  with  respect  to  I/O  media  selection,  file  structures,  frequency  of 
run,  etc.,  may  result  in  less  than  optimal  performance  for  as  long  as  an 
application  is  in  existence. 

14.2.2.2  Programming.  A  computer  programmer's  proficiency  and  the 
availability  of  program  optimization  tools,  for  example,  will  influence 
program  design  and  coding,  and  affect  system  performance. 

14.2.2.3  Hardware  Configuration.  Specific  components  of  a  computer  system 
may  be  mismatched  to  the  system  as  a  whole,  causing  major  subsystems  (or  the 
entire  configuration)  to  operate  at  a  reduced  performance  level.  Even  if 
the  performance  capabilities  of  the  individual  subsystems  are  reasonably 
well-matched,  the  system  may  be  poorly  configured  for  the  site's  workload, 
resulting  in  poor  performance. 

14.2.2.4 System Software. System software is considered to be those
programs  supplied  by  the  mainframe  vendor.  This  software  may  be 
inappropriately  parameterized  to  fit  the  site  workload,  or  may  be  a  source 
of  high  overhead  or  bottlenecks  to  efficient  workload  processing. 

14.2.2.5 Operations. Operations consists of functions such as the
scheduling  of  the  workload,  providing  job  assembly  (and  library)  services, 
and  operating  the  system  through  the  console.  All  of  these  functions  are 
vital  to  the  proper  operation  of  the  system.  Mistakes,  insufficient 
training,  poor  documentation,  and  a  variety  of  other  factors  may  contribute 
to  operational  problems  which  substantially  decrease  system  performance. 

14.2.2.6  Communications  Hardware  and  Software.  A  communications  network, 
its  interface  to  a  central  system,  and  the  software  used  to  control  the 
on-line  applications,  may  have  a  significant  impact  on  the  system's  overall 
performance. 

14.2.2.7  Computer  System  Performance  Tuning.  The  process  of  analyzing  and 
appropriately  adjusting  computer  system  performance  variables  is  known  as 
computer  system  performance  tuning.  After  becoming  proficient  in  the  use  of 
GMF,  the  user  should  be  able  to  conduct  his  own  tuning  study. 

14.2.2.8  Turnaround  Time.  This  is  the  total  elapsed  time  taken  by  a  job 
(or  set  of  jobs)  submitted  to  a  WWMCCS  site  for  batch  processing.  Total 
batch  turnaround  time  is  comprised  of  computer  system  processing  and 
physical  input  and  output  handling  in  the  machine  room,  both  before  and 
after  system  processing.  A  job's  total  turnaround  time  therefore  includes 
all  processing  and  waiting  points  through  which  the  job  must  pass  from 
submission  until  return  to  a  user.  The  GMC  measures  only  a  batch  job’s 
computer processing turnaround time. The other components of total
turnaround time (input/output handling) are also important and can be a
bottleneck to total turnaround time.
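
The relationship between total turnaround time and the portion that the GMC measures can be expressed as a simple sum. The Python sketch below is illustrative; the class, field names, and sample values are hypothetical and only restate the definition given above.

```python
from dataclasses import dataclass


@dataclass
class BatchJobTimes:
    """Elapsed minutes for one batch job (all values are assumed inputs)."""
    input_handling_min: float     # physical handling before system processing
    system_processing_min: float  # the portion measured by the GMC
    output_handling_min: float    # physical handling after system processing

    def total_turnaround_min(self):
        """Total turnaround from submission until return to the user."""
        return (self.input_handling_min
                + self.system_processing_min
                + self.output_handling_min)

    def share_measured_by_gmc(self):
        """Fraction of total turnaround that is computer processing time."""
        total = self.total_turnaround_min()
        return self.system_processing_min / total if total else 0.0


# Hypothetical job: 30 min input handling, 45 min processing, 15 min output.
job = BatchJobTimes(30.0, 45.0, 15.0)
print(job.total_turnaround_min())
print(job.share_measured_by_gmc())
```

When the measured share is small, improvements found through the GMC reports alone will have a limited effect on total turnaround time.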

14.2.2.9  CPE  User  Objectives.  The  definitions  associated  with  any  or  all 
of  the  above  data  processing  functions  can  be  considered  variables  that 
affect  computer  system  performance.  The  GMC  reports  should  be  used  to 
determine  which  of  the  above  functions  need  further  investigation. 

14.3 Solutions to Performance Problems

Particular  resource  bottlenecks  may  be  confirmed  as  elongating  turnaround  or 
response  time.  Several  solutions  can  usually  be  applied  to  remove  a 
particular  bottleneck.  In  general,  four  kinds  of  solutions  exist  to  remove 
identified  bottlenecks. 

14.3.1  Scheduling  Solutions.  These  solutions  change  the  way  that  either 
batch  or  TSS  workloads  are  scheduled  for  processing.  They  shift  particular 
workloads  to  more  evenly  distribute  system  resources  across  the  workload. 

14.3.2 Parameter Solutions. These changes involve adjustments to system or
subsystem functions. Examples include: (1) changes to the parameters of
the  GCOS  Dispatcher  or  (2)  a  change  in  the  placement  of  GCOS  libraries.  A 
solution  may  also  include  specific  changes  to  GCOS  code,  made  through 
authorized  software  patch  procedures. 

14.3.3  Programming  Solutions.  These  changes  can  involve  modification  of 
one  or  more  processing  jobs  running  in  the  system.  Recommendations  may  be 
made  to  speed  application  of  jobs  by  changing  particular  file  locations 
discovered  as  delaying  the  job. 

14.3.4 Sizing Solutions. These types of system changes involve an
increase,  decrease,  or  realignment  in  the  system's  hardware  configuration. 

14.4  Structure  of  the  Analysis  Process 

The  analysis  process  in  figure  14-1  is  a  flow  chart  that  should  be 
referenced  during  a  performance  study.  The  analysis  process  is  comprised  of 
two phases: (1) a Problem Definition Phase and (2) a Problem Analysis
Phase.  The  activities  of  the  Problem  Definition  Phase  are  directed  toward 
determining  whether  a  batch  turnaround  time  or  TSS  response  time  problem 
actually  exists.  The  activities  of  the  Problem  Analysis  Phase  are  directed 
toward  revealing  causes  of  the  identified  turnaround  time  or  response  time 
problem. 


14.4.1 Starting The Process. The evaluation process may begin in one of
three ways: (1) by direct request, (2) by an internal review of
site-selected service elongation metrics, or (3) by user requests.

14.4.1.1 Direct Request. The request may be initiated by WWMCCS
management, directing a facility to perform an analysis and tuning effort.
Many reasons could cause management to request a CPE study. Examples
include: (1) desire to extend the life of a system, (2) pressure applied
from users through higher authority, (3) awareness of the potential for
performance gain after an evaluation, and (4) a desire for management to
evaluate cost reduction programs.

14.4.1.2  Internal  Review.  The  decision  to  initiate  a  study  may  result  from 
the  output  of  a  performance  exception  reporting  system  or  from  the  desire  to 
conduct  a  periodic  review  of  site  operations. 

14.4.1.3  User  Requests.  Users  of  site  services  may  request  an  evaluation. 
Complaints  of  unacceptable  batch  turnaround  time  or  TSS  response  time  can 
initiate  a  search  for  their  causes.  Unfavorable  comparisons  with  service 
rates  provided  at  other  installations  can  point  out  the  need  for  a  study  to 
determine  if  service  can  be  improved. 


14.4.2 Problem Definition Phase. The Problem Definition Phase (see figure
14-1, part 1) is comprised of four activities: (1) define and verify the
problem, (2) gain understanding of facility environment, (3) understand
installation service objectives and priorities, and (4) specify current
evaluation objectives. Some of these activities are initiated as evaluation
begins. Others are maintained as on-going activities. The activities of
the Problem Definition Phase are introduced below.

Figure 14-1. Flowchart of Evaluation/Tuning Process (Part 1 of 2)

14.4.2.1  Define  and  Verify  the  Problem.  For  whatever  reason  the  evaluation 
was  initiated,  a  starting  point  or  premise  must  be  established.  The  problem 
must be described as well as it can be at this point, even if at a very general
level.  "Batch  turnaround  time  is  poor"  or  "TSS  response  time  must  be 
improved"  are  valid  statements  of  problems  at  this  time.  Site  management 
must  then  verify  that  this  is  the  problem  they  wish  to  pursue.  The  facility 
staff  and  users  should  verify  that,  in  fact,  this  problem  does  exist  and 
that  they  view  it  as  a  problem  of  importance.  Note  that  a  specific 
evaluation  objective  is  not  yet  defined;  only  the  basic  problem  statement  is 
verified.


14.4.2.2  Gain  Understanding  of  Facility  Environment.  This  second  step 
helps  the  CPE  analyst  relate  the  system  environment  to  the  problem 
statement.  This  activity  may  be  time-consuming  if  the  analyst  has  little 
personal  experience  with  the  facility  and  the  system.  Information  collected 
during  this  activity  will  help  determine  the  reason  for  the  problem's 
importance;  it  may  explain  a  reason  for  the  problem's  existence  and  help 
rank one problem against another. The activity forces an analyst to view
the  entire  facility;  this  is  important  since  many  performance  problems  are 
caused  by  combinations  of  factors.  The  activity  assists  the  analyst  in 
understanding  how  to  narrow,  refine,  modify,  and  improve  the  problem's 
definition  in  order  to  attempt  a  valid  solution.  Each  area  described  in  the 
following  paragraphs  should  be  examined.  It  may  help  the  analyst  to  examine 
them  in  the  order  they  are  given  below. 

14.4.2.2.1  Hardware  Configuration.  The  exact  system  configuration  should 
be  determined  and  a  diagram  of  the  overall  structure  of  the  system  should  be 
created. Reliability statistics on major configuration components and a
history of hardware changes to the configuration should be collected.


14.4.2.2.2  Software  Configuration  and  Development  Practices.  The  exact 
operating  system  configuration  should  be  determined.  Any  local  changes  made 
to  the  system  and  any  specialized  major  subsystems  that  are  running  should 
be  documented.  Document  software  monitors  and  other  CPE  measurement 
techniques  used.  Determine  program  optimization  techniques  used  and  note 
standards  imposed  on  operations  and  programming  staffs. 

14.4.2.2.3  Existing  CPE  Practices.  Responsibility  for  computer  performance 
evaluation  at  the  site  must  be  determined.  Identify  sources  of  CPE  data 
that  are  employed  and  determine  how  the  data  are  used  to  evaluate  system 
performance.  Determine  if  personnel  in  all  site  functional  areas  (e.g., 
programming and operations) can relate to the performance data. Research
how previous CPE studies conducted at the site have been documented.

14.4.2.2.4  Site  Workload  Characteristics.  The  existing  workload  uses  of 
system  resources  should  be  described.  Identify  patterns  of  resource  use  by 
selected  jobs.  Obvious  bottlenecks  to  the  handling  of  jobs  within  the 
installation  should  be  determined.  The  resource  monitor,  the  RMC  portion  of 
GMF,  should  prove  very  useful  during  this  phase. 

14.4.2.2.5  System  Users.  The  workload  analysis  should  be  expanded  by 
investigating the practices of the major users in the installation.
Determine how special priorities are assigned to these users and whether the
users  can  directly  control  system  resources.  Note  chargeback  schemes  used 
(if  any)  to  level  the  workload  across  the  operating  day  and  night.  Evaluate 
the  levels  of  user  satisfaction  (or  dissatisfaction)  exhibited. 

14.4.2.2.6  Operations  Practices.  The  operating  shift  schedule  for  the  site 
should  be  examined.  Analyze  pre-scheduled  and  nonproduction  time  and 
unscheduled  nonproduction  time.  Evaluate  training  provided  for  operators 
and systems programmers. Study site library maintenance procedures.
Examine the production control points established by operations. Observe
how  formalized  logs  are  maintained  to  track  work  as  it  passes  through  the 
installation. 

14.4.2.2.7  Batch  Job  Scheduling.  Observe  how  batch  work  is  accepted  for 
processing  at  the  site.  Study  the  techniques  used  to  schedule  the  batch 
workload  that  are  either  automated  or  are  manually  implemented. 

14.4.2.2.8  Site  and  Computer  Facility  Organization.  Study  the  organization 
structure  and  reporting  authority  at  the  site.  This  includes  the 
organization  of  the  site  itself.  Determine  the  extent  to  which  system 
sizing  activities  are  organizationally  separated  from  operations  and 
applications  systems  development. 

14.4.2.2.9  CPE  Checklist.  The  use  of  a  CPE  checklist  will  greatly  aid  the 
analyst  in  his  understanding  of  the  facility  environment.  Figure  14-2  is  an 
example  of  such  a  checklist. 

14.4.2.3  Understand  Installation  Service  Objectives  and  Priorities.  This 
third  step  determines  site  processing  objectives  and  priorities  to  be 
applied  to  decisions  made  during  both  the  Problem  Definition  and  Problem 
Analysis  Phases. 

14.4.2.3.1  Installation  Service  Objectives.  To  be  effective,  a  CPE  effort 
must  consider  the  hardware,  software,  personnel,  and  service  objectives  of 
the  computer  installation  being  analyzed.  Although  the  first  three  areas 
are  readily  identifiable,  the  site's  service  objectives  are  often 
misunderstood.  An  installation's  objectives  can  emphasize  production  or 

A. Configuration:

   _ 1.  MPC/PSIA crossbar check
   _ 2.  Trace setting/SCF collection
   _ 3.  CATDUP off unless device saves/restores are done
   _ 4.  NIAST buffer count
   _ 5.  TSS swap file location/size (2200)
   _ 6.  Placement of system files on low-use devices
   _ 7.  SYSOUT files 4 or 6 - do not use 5
   _ 8.  No module CKSUM
   _ 9.  SSLOAD maximum 20-30 based on workload
   _ 10. IOM-lines
   _ 11. HCM available for other modules
   _ 12. SCOMM - set for intercom I/O area. Should not be 128

B. Options:

   _ 1.  Dispatcher
         _ a.  Urgency thruput
         _ b.  MCOUNT off
         _ c.  Priority B (why)
   _ 2.  TSS
         _ a.  Devices for swap files
         _ b.  Large user penalty
         _ c.  Priority B (NO)
         _ d.  Round Robin swap files (Reston patch number)
         _ e.  4 swap files - large system

Figure 14-2. CPE Checklist (Part 1 of 3)

   _ 3.  FMS Cache #A2E
   _ 4.  Memory move quantity #AOG
   _ 5.  SSA in memory pushdown #A1E
         bit 3 = 1
   _ 6.  SSA Cache size #A1Y
   _ 7.  GRTS
         _ a.  DBIAS.BBIAS = 10008
         _ b.  MAXMSG = 6 if load 5600§
         _ c.  Line size setting
         _ d.  Trace setting

C. Hardware:

   _ 1.  Memory interleaving
   _ 2.  MPC/PSIA connection

D. Miscellaneous:

   _ 1.  MSCAN to set up scheduler classes based on required resources
   _ 2.  Limit use of high urgency
   _ 3.  USERID hash
   _ 4.  Cold Boot frequency (FRAG)
   _ 5.  TPAPs not needed
   _ 6.  Jobs waiting peripherals on VIDEO being reviewed by operator
         (overdue allocation problem)

E. Memory Evaluation

F. CPU Evaluation

Figure 14-2. (Part 2 of 3)

availability, or a mixture of both. A mixture is generally dominated by
either  a  production  or  availability  objective.  A  production  objective 
attempts to take full advantage of the capabilities of the system in terms
of  throughput.  With  a  customer-oriented  availability  objective,  however, 
the  interests  of  the  users  take  priority  over  the  most  efficient  use  of 
system  resources.  In  either  case,  the  computer  installation's  management 
objective  is  return  on  investment.  For  the  production  objective,  the  most 
important  investment  is  the  computer;  for  the  availability  objective,  the 
most  important  investment  is  the  people  or  system  using  the  computer.  The 
difference  between  these  two  objectives  is  reflected  in  how  the  performance 
parameters  are  evaluated. 

14.4.2.3.1.1  Production  Objective.  Traditional  data  processing  is 
production-oriented.  The  computer  installation  is  viewed  as  a  large 
investment  for  capacity  to  do  routine  work.  Much  of  that  capacity  depends 
on a high degree of simultaneous use of many system components.
Management's production objective is to use as many of the components as
much  of  the  time  as  possible  by  scheduling  compatible  jobs.  Under  these 
conditions,  low  batch  turnaround  and  TSS  response  times  are  secondary  to 
high  machine  utilization.  Management's  success  is  generally  defined  in 
terms  of  a  relatively  high  number  of  jobs  contending  for  computer  resources 
(a  high  multiprogramming  level),  high  central  processor  unit  (CPU)  activity, 
nearly  full  memory  utilization,  and  highly  active  channels.  Schedule-driven 
processing  uses  internal  and  external  priority  allocations  to  sustain 
efficiency.  Programs  must  be  written  to  share  available  resources.  This 
can  be  done  with  program  segmentation  techniques,  use  of  a  minimum  number  of 
devices, the indirect use of input/output through spooling, and adherence to
rigid  standards  —  all  at  the  expense  of  the  individual  job.  Data  must  be 
so  distributed  that  device  activity  is  economically  justified  with  the 
attainment  of  a  low  system  wait,  low  unit  cost,  and  high  simultaneity. 

14.4.2.3.1.2  Availability  Objective.  Service-oriented  systems  are  more 
concerned  with  return  on  investment  for  users  (managers,  scientists,  or 
programmers)  than  with  the  computer  itself.  Low  TSS  response  and  batch 
turnaround  times  are  critical.  Their  increase  delays  user  operations  on 
such  applications  as  on-line  command  and  control  systems,  real-time  update 
systems,  scientific  support  services,  and  program  development  systems. 
Demand-driven  processing  varies  in  activity  levels,  both  by  users  and  type 
of work. Minimum emphasis is placed on scheduling. The system must
therefore have available capacity ready to respond to demand. Such a system
must have utilization rates well below the 100% limit in order to
accommodate  the  variations  in  demand.  Success  is  measured  in  terms  of  user 
satisfaction  and  little  emphasis  is  placed  on  reporting  high  utilization 
figures. 
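
The need for utilization well below 100% can be illustrated with a simple queueing approximation. The Python sketch below assumes a single-server M/M/1 model, in which mean response time is the service time divided by (1 - utilization). This model is an assumption introduced for illustration only; it is not part of the GMC and is not a description of any particular system's behavior.

```python
def mm1_response_time(service_time, utilization):
    """Mean response time under an assumed M/M/1 model: S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1.0 - utilization)


# Hypothetical 1-second service time at increasing utilization levels.
for rho in (0.50, 0.80, 0.90, 0.95):
    print(f"utilization {rho:4.0%}: response {mm1_response_time(1.0, rho):6.2f} s")
```

Under this assumed model, response time grows sharply as utilization approaches 100%, which is consistent with the statement that a demand-driven system must hold capacity in reserve.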

14.4.2.3.1.3  Mixed  Objectives.  Usually  computer  installations  encounter 
(1) demands for highly responsive services and (2) pressure from management
for high production rates. The two objectives are not mutually exclusive.
Predominance of either objective can be identified within operational time
periods,  and  management  must  evaluate  whether  the  satisfaction  of  one 
objective  might  deleteriously  affect  the  satisfaction  of  another. 

Computer  performance  evaluation  can  serve  the  management  of  either  type  of 
computer  installation  and  can  also  serve  mixtures  that  might  have  different 
objectives,  depending  on  the  time  of  day  or  day  of  the  week.  However,  it  is 
important  that  managers,  auditors,  and  executives  recognize  the  implications 
of  the  different  objectives  when  they  compare  the  performance  of  one 
installation  to  another. 

14.4.2.3.2  Installation  Priorities.  The  installation's  priorities  derive 
from  the  predominance  of  either  the  production  objective  or  the  availability 
objective  at  a  particular  site.  They  also,  by  implication,  determine  the 
sequence  in  which  system  tuning  solutions  are  applied. 

14.4.2.3.2.1  Service  Priorities.  A  site  may  feel  that  low  TSS  response 
time  is  more  desirable  than  low  batch  turnaround  time  for  a  certain  period 
of  the  operating  day.  The  analysis  procedures  presented  in  this  section  are 
not  generally  directed  toward  determining  which  of  the  two  is  "more 
important".  However,  the  sequence  of  examining  either  of  the  two  service 
areas  may  be  affected  by  this  priority. 

14.4.2.3.2.2  Evaluation/Tuning  Solution  Priorities.  Tuning  solutions  are 
presented  to  correct  system  bottleneck  conditions.  In  nearly  all  cases, 
more  than  one  solution  is  specified  to  correct  a  problem.  The  solutions  are 
generally  proposed  in  a  sequence  that  recommends  the  more  quickly  (or 
easily)  applied  solution  be  implemented  before  others.  Installation 
requirements  may  change  this  sequence. 

14.4.2.4  Specify  Current  Evaluation  Objective.  This  fourth  step  is  used  to 
determine  whether  the  originally  specified  problem  can  actually  be 
investigated  with  currently-available  procedures. 

14.4.2.4.1  Objective.  An  objective  is  a  stated  (i.e.,  documented)  goal  of 
an  analysis.  An  objective  must  be  stated  in  specific  and  quantified  terms. 
The  objective  statement  is  used  to  determine  when  a  particular  analysis  has 
been  completed.  Examples  of  well-stated  objectives  are:  (1)  "reduce  mean
response  time  for  (stated)  non-trivial  TSS  commands  to  five  seconds"  and  (2)
"reduce  the  mean  batch  turnaround  time  for  (stated)  jobs  to  1.5  hours."  A
well-stated  objective  includes:  (1)  definition  of  the  workload  category,
(2)  a  description  of  the  process  to  be  investigated,  and  (3)  a  service
metric  value.  Examples  of  badly-stated  objectives  are  "reduce  TSS  response 
time"  and  "improve  turnaround  time."  Note  the  missing  components  of  these 
objectives. 
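
The three components of a well-stated objective can be checked mechanically. The following is a minimal sketch only; the class name, field names, and example values are hypothetical and are not part of any GMC tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvaluationObjective:
    """Hypothetical record of one documented evaluation objective."""
    workload_category: Optional[str]  # e.g. "non-trivial TSS commands"
    process: Optional[str]            # e.g. "mean response time"
    target_value: Optional[float]     # service metric value, e.g. 5.0
    unit: str = ""                    # e.g. "seconds"

    def missing_components(self) -> List[str]:
        """Return the names of the components that have not been stated."""
        missing = []
        if not self.workload_category:
            missing.append("workload category")
        if not self.process:
            missing.append("process to be investigated")
        if self.target_value is None:
            missing.append("service metric value")
        return missing


# "Reduce mean response time for non-trivial TSS commands to five seconds."
well_stated = EvaluationObjective(
    "non-trivial TSS commands", "mean response time", 5.0, "seconds")

# "Reduce TSS response time." No workload category and no metric value.
badly_stated = EvaluationObjective(None, "TSS response time", None)
```

An objective is complete when `missing_components()` returns an empty list.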

14.4.2.4.2  Objective  Decision.  Determine  whether  the  objective  as 
documented  is  realistic,  attainable,  and  a  cost  effective  target. 

14.4.2.4.2.1  Attainable  Objectives.  Objectives  must  be  attainable.  It
might  be  possible  to  reduce  a  program's  elapsed  time  to  certain  limits,  but 
not  to  a  desired  limit,  simply  because  of  the  quantity  of  I/O  and 
computation  that  occurs  within  the  program.  If  this  is  the  case, 
re-evaluate  the  objective. 

14.4.2.4.2.2  Realistic  Objectives.  It  might  be  possible  to  reduce  a 
program's  elapsed  time  to  the  desired  amount.  However,  particular
site  constraints  may  prevent  its  being  a  realistic  goal.  These  constraints
might  have  their  source  outside  of  the  tuning  effort,  but  directly  affect 
the  internal  performance  of  the  program.  These  might  include,  for  example, 
the  requirement  to  give  certain  other  jobs  higher  priority.  If  this  is  the 
case,  re-evaluate  the  objective. 

14.4.2.4.2.3  Cost-Effective  Objectives.  The  additional  effort,  cost,  and 
time  required  to  achieve  an  attainable  and  realistic  objective  might  not  be 
worth  it.  A  particular  increment  of  performance  improvement  might  simply 
not  be  cost  effective.  Determine  the  amount  of  improvement  likely  with 
additional  effort.  Decide  whether  the  effort  might  be  better  off  abandoned. 

14.4.2.4.3  Determine  if  Worth  Continuing.  The  final  part  of  step  four  is 
to  determine  whether  the  process  itself  is  worth  continuing.  There  may  be 
potential  performance  problems  of  greater  magnitude  or  immediate  importance 
that  may  have  been  uncovered  during  the  Problem  Definition  Phase.  Document 
the  decision  and  the  objective  and  obtain  management  approval  of  them. 

14.4.2.4.4  Begin  Problem  Analysis.  Having  obtained  concurrence  from 
management,  begin  the  second  half  of  the  performance  evaluation  process  (see 
figure  14-1,  part  2):  the  Problem  Analysis  Phase.  Activities  of  this  phase 
involve  the  execution  of  the  various  GMC  monitors  and  the  analysis  of  their 
output. 

14.4.3  Problem  Analysis  Phase.  The  Problem  Analysis  Phase  (see  figure 
14-1,  part  2)  is  comprised  of  five  activities  briefly  described  below: 

14.4.3.1  Run  Appropriate  Analysis  Tool.  Analysis  requires  collection  of 
data  to  start  an  investigation.  The  various  GMC  monitors  have  been 
developed  for  this  purpose. 

14.4.3.2  Evaluate  System  Output.  The  output  from  the  various  GMC  monitors 
must  be  analyzed. 

14.4.3.3  Follow  Tuning  Procedures.  The  user  must  develop  his  own  skill  in
devising  specific  procedures  to  test  hypotheses  by  examining  the
reports  produced  by  the  GMC  monitors.  Specific  system  tuning  steps  can 
result  from  the  tests. 

14.4.3.4  Evaluate  Need  to  Continue  the  Tuning.  This  decision  is  made  after 
the  relevant  recommendations  have  been  implemented.  If  the  current 
objective  (specified  in  the  last  activity  of  the  Problem  Definition  Phase)
has  not  been  met,  the  user's  analysis  can  define  areas  for  further
investigation. 

14.5  Composition  of  a  Performance  Evaluation  Team 

While  this  section  has  been  developed  to  aid  WWMCCS  ADP  Managers  in  the
application  of  various  computer  performance  evaluation  tools  and  techniques, 
the  actual  team  conducting  this  analysis  must  have  a  reasonable  education 
and  experience  level  on  the  H6000.  The  team  should  be  comprised  of  at  least 
one  individual  familiar  with  system  software  and  its  operation;  one 
individual  generally  familiar  with  system  hardware  (at  least  in  terms  of 
functionality);  one  individual  familiar  with  the  procedures  for  executing 
jobs  on  the  H6000  and  at  least  one  individual  (probably  a  manager)  who  is 
familiar  with  the  objectives  and  priorities  of  the  installation.  It 
certainly  is  possible  that  one  individual  may  be  able  to  handle  several  of 
the  above  functions. 

14.6  System  Evaluation 

14.6.1  Introduction.  A  total  system  evaluation  should  be  performed  at 
least  once  a  year.  It  may  have  to  be  performed  more  often,  depending  upon 
the  results  of  daily  resource  statistics.  A  steady  increase  in  resource 
utilization  would  possibly  imply  the  need  for  an  evaluation.  The  Resource
Monitor  (RMON)  of  GMF  should  be  used  to  track  daily  resource  utilization. 

The  data  collected  for  the  detailed  system  evaluation  must  be  representative 
(i.e.,  typical)  of  the  data  that  would  be  collected  on  a  "normal"  day.  This 
requires  the  collection  of  data  over  several  time  periods.  The  suggested 
schedule  for  collecting  data  for  a  total  system  analysis  is  to  run  the  GMC 
monitors  intermittently  for  two  weeks.  The  monitors  should  be  run  as  two 
separate  groups,  on  alternate  days.  Group  one  would  consist  of  the  Memory 
Utilization  Monitor  (MUM)  (possibly  the  Idle  Monitor  (IDLEM)),  CPU  Monitor
(CPUM),  Communications  Analysis  Monitor  (CAM),  and  the  GRTS  Monitor  (GRTM).
Group  two  would  consist  of  the  Mass  Storage  Monitor  (MSM),  Channel  Monitor
(CM),  and  possibly  the  Idle  Monitor  (IDLEM).  This  sequence  provides  a  good
representative  sample  of  the  system  workload.  To  limit  the  amount  of  data
collected,  it  is  advisable  to  run  Group  Two  only  during  prime  time,  heavy 
usage  (4-6  hours  per  day).  It  may  also  be  desirable  to  run  the  combined  set 
of  monitors  at  least  once  per  week  during  the  two-week  monitoring  session. 
This  allows  the  analyst  to  get  a  complete,  unified  picture  of  the  entire 
performance  of  the  system.  It  must  be  realized  that  a  large  number  of  tapes 
can  be  generated  under  such  a  monitor  configuration.  Therefore,  data 
collection  should  be  limited  to  no  more  than  four  hours  (during  heavy  prime 
time  usage). 


14.6.2  Selecting  a  Representative  Value  From  GMC  Histogram  Reports.  Some 
test  procedures  require  an  analyst  to  pick  a  "Representative  Value"  to 
describe  a  frequency  distribution.  The  following  paragraphs  give  guidelines
for  choosing  Representative  Values  based  on  the  type  of  distribution
observed. 

Figure  14-3  shows  a  hypothetical  "Total  Elapsed  Time  an  Activity  Was  in 
Memory"  report.  Variations  of  this  report  will  be  used  to  illustrate  types 
of  distributions.  This  chapter  will  reference  only  the  pictorial  part  of
the  report.  All  other  parts  of  the  histogram  report  are  described  in 
Chapter  6  of  this  document. 

14.6.2.1  Symmetric  Distribution  Closely  Clustered  Around  a  Single  Point. 

The  "Representative  Value"  for  the  distribution  shown  in  Figure  14-3  is  easy 
to  select.  The  values  are  clustered  around  the  "Average":  17.6  tenths  of  a 
second,  or  1.76  seconds.  The  shape  of  the  distribution  is  symmetric  — 
about  the  same  number  of  activities  had  values  over  1.76  seconds  as  under 
1.76  seconds,  and  the  standard  deviation  is  small  when  compared  to  the 
average.  The  absence  of  a  second  line  under  the  "Entries  Total"  line
indicates  that  no  activities  stayed  in  memory  longer  than  2.4  seconds.  When
a  distribution  resembles  this  example,  use  the  "Average"  printed  at  the
bottom  as  the  Representative  Value. 

14.6.2.2  Skewed  Distribution.  Figure  14-4  shows  another  distribution. 

This  distribution  is  "skewed"  (i.e.,  not  symmetric),  because  most  of  the
activities  spent  around  0.1  to  0.7  seconds  in  memory,  while  some  spent  as 
much  as  four  or  five  seconds.  Care  should  be  taken  when  selecting  a 
"Representative  Value"  from  this  distribution.  If  the  analyst  wants  to 
emphasize  the  "typical"  activity,  which  stayed  in  memory  0.3  seconds  or 
less,  he  could  select  the  "median"  (the  value  which  evenly  divides  the 
activities  in  the  distribution  —  half  spent  less  time  in  memory,  and  half 
spent  greater  time  in  memory).  In  figure  14-4,  the  median  is  about  0.29 
seconds.  The  median  can  be  estimated  from  these  reports  by  reading  down
the  "Cumulative"  column  (not  displayed  in  the  figure)  until  the  value  first
exceeds  50.  The  median  falls  within  the  time  range  of  this  row.
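
The median estimate described above can be sketched as follows. The row layout (low bound, high bound, cumulative percent) and the sample figures are assumptions made for illustration; they are not taken from an actual histogram report.

```python
def median_row(rows):
    """Return the (low, high) time range of the first row whose
    cumulative percentage exceeds 50, or None if no row does.

    rows: list of (low, high, cumulative_percent) in report order.
    """
    for low, high, cumulative in rows:
        if cumulative > 50:
            return (low, high)
    return None


# Hypothetical rows, times in seconds.
sample_rows = [
    (0.0, 0.1, 12.0),
    (0.1, 0.2, 31.0),
    (0.2, 0.3, 52.0),   # first row above 50 percent
    (0.3, 0.4, 68.0),
]
```

With these sample rows, the median falls in the 0.2 to 0.3 second range.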

14.6.2.3  Distribution  With  Outliers.  There  are  some  instances  when  the 
distribution  will  also  have  some  values  that  were  too  big  to  fit  in  the 
histogram.  This  condition  will  be  indicated  by  an  additional  output  line  at
the  bottom  of  the  report.  This  line  will  indicate  the  number  of  occurrences
that  were  outliers,  the  average  for  just  the  outliers,  and  the  average  for
the  values  that  fit  into  the  report,  minus  the  outliers.  The  three
important  factors  about  these  types  of  distributions  are:  (1)  the  number  of
times  they  occur;  (2)  the  percent  of  the  total  values  that  are  outliers;  (3)
the  "in-range  average."

If  the  percent  of  outliers  is  greater  than  10%,  the  analyst  should  use  the
overall  average,  given  in  the  first  line  of  the  report,  for  any
comparisons.  If  the  percentage  of  out-of-range  values  is  less  than  10%,  the
analyst  can  use  the  "in-range  average"  value  for  his  comparisons  since  the 
effect  of  the  outliers  will  most  likely  be  minimal  on  the  total  system 
performance. 
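
The 10% rule can be expressed as a short selection function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the text does not say which average to use at exactly 10%, so the in-range average is assumed for that case.

```python
def representative_average(overall_avg, in_range_avg, outlier_count, total_count):
    """Choose the average to use for comparisons.

    Above 10 percent outliers: use the overall average.
    Otherwise: use the in-range average (exactly 10 percent is an
    assumption, since the text does not cover that case).
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    outlier_percent = 100.0 * outlier_count / total_count
    if outlier_percent > 10.0:
        return overall_avg
    return in_range_avg
```

For example, with 5 outliers in 100 values the in-range average is returned, and with 15 outliers in 100 values the overall average is returned.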

Figure  14-4.  Sample  Skewed  Distribution


14.6.3  Memory  Evaluation.  The  first  step  in  a  memory  evaluation  is  to 
summarize  all  the  pertinent  information.  The  easiest  way  to  do  this  is  for 
the  user  to  create  a  table  similar  to  figure  14-5.  The  information  for  each 
column  is  collected  from  various  MUM  reports.  Once  the  chart  is  filled  out, 
the  analyst  can  then  ascertain  a  reasonable  idea  of  the  overall  memory 
status  of  the  system.  The  reports  required  to  collect  the  statistics  will 
be  discussed,  as  well  as  an  analysis  procedure.  The  current  version  of  the 
MUM  will  automatically  produce  this  table  (see  subsection  6.3.13  -  Memory 
Statistics  Report). 

14.6.3.1  Obtaining  the  Data.  The  first  step  is  for  the  user  to  examine  the 
monitor  data  collection  reports  for  the  total  time  of  the  run.  Any 
monitoring  session  of  less  than  2  hours  in  duration  should  be  discarded. 

The  Title  Page  will  indicate  the  overhead  generated  by  each  of  the  GMF 
monitors  active  during  the  data  collection  phase.  While  this  information  is 
not  used  in  this  analysis,  it  is  an  item  of  information  that  is  usually 
requested.  Columns  1  and  2  of  figure  14-5  can  be  obtained  directly  from  the 
MUM  Title  Page.  Under  the  System  Configuration  section  of  the  Title  Page, 
the  amount  of  memory  configured  on  the  system  will  appear.  This  figure 
should  be  tentatively  noted  under  column  5.  Following  this  figure,  the 
Title  Page  will  list  the  amount  of  memory  used  by  Hard  Core,  Core  Allocator,
and  SSA  Cache,  and  any  memory  released  during  the  monitoring
session.  All  these  functions  have  the  effect  of  reducing  the  available  user
memory.  These  values  should  be  summed  and  tentatively  recorded  under  column 
6. 

Following  the  above  information  are  several  lines  of  statistics  concerning 
the  processing  characteristics  of  the  system.  Columns  17  and  18  may  be 
filled  from  this  information.  When  determining  the  number  of  activities 
processed  per  hour,  the  analyst  has  two  figures  available.  The  analyst  may 
choose  to  record  the  total  number  of  activities  processed  per  hour  or  he  may 
record  the  number  of  (non-system  scheduler)  activities  processed  per  hour. 
During  the  course  of  the  day  many  system  scheduler  activities  (activity  0  of 
any  batch  job)  are  processed.  These  activities  are  not  really  user 
generated,  but  rather  are  generated  from  the  system.  Therefore,  by  removing 
these  activities  from  the  total  activities  processed,  a  more  realistic 
figure  will  be  generated.  The  final  three  lines  of  the  title  page  can  be 
used  to  provide  the  data  for  columns  3  and  7  on  the  chart.  Column  4  is 
filled  from  Report  1,  columns  8  and  9  from  Reports  11  and  12,  columns  10  and 
11  from  Reports  16  and  17,  column  12  from  Report  19,  column  13  from  Report
51,  column  14  from  Report  37,  column  15  from  Report  50,  and  column  16  from 
Report  36.  The  two  remaining  columns  that  need  to  be  completed  are  columns 
5  and  6.  The  System  Program  Usage  of  Memory  Report  should  be  used  to 
complete  column  6.  When  processing  the  MUM  data  reduction  program,  the  user 
should  seriously  consider  using  the  MASTER  input  option.  This  will  provide 
the  user  with  a  much  better  indication  of  the  true  system  program  load.  In
order  to  complete  column  6,  the  user  should  record  the  total  value  appearing 
under  the  "WEIGHTED  TTM"  column  of  the  System  Program  Usage  of  Memory 
Report.  This  value  should  then  be  added  to  the  value  already  recorded  under
column  6.  Finally,  column  5  can  be  determined  by  subtracting  the  value
reported  in  column  6  from  the  value  previously  reported  in  column  5.

Figure  14-5.  Memory  Statistics
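
The arithmetic for columns 5 and 6 can be sketched as shown below. All parameter names and the sample figures are hypothetical; the values themselves would be read from the MUM Title Page and the System Program Usage of Memory Report as described above.

```python
def memory_columns(configured_k, hard_core_k, core_allocator_k,
                   ssa_cache_k, released_k, system_program_weighted_k):
    """Return (column_6, column_5), both in K words.

    column_6: memory not available to users (Title Page items plus the
              weighted total from the System Program Usage of Memory Report).
    column_5: configured memory minus column_6.
    """
    column_6 = (hard_core_k + core_allocator_k + ssa_cache_k
                + released_k + system_program_weighted_k)
    column_5 = configured_k - column_6
    return column_6, column_5


# Hypothetical site: 512K configured.
example_col6, example_col5 = memory_columns(512, 60, 4, 8, 0, 120)
```

In this hypothetical case column 6 is 192K and column 5 is 320K.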

14.6.3.2  Evaluating  the  Data.  Figure  14-5  should  be  used  in  the  following 
manner  to  determine  if  memory  is  a  system  constraint. 

Step  1  -  If  column  7  shows  a  surplus  of  memory  greater  than  15%  of  the 
total  available  memory  or  greater  than  two  times  the  value  reported  in 
column  4,  then  the  implication  is  that  there  is  no  memory  constraint  on  the 
system.  If  Column  7  shows  a  surplus  of  memory,  but  does  not  exceed  the 
aforementioned  limits,  the  implication  is  that  the  current  system  has 
sufficient  memory,  but  that  the  system  is  approaching  memory  saturation.  If 
Column  7  shows  a  shortfall  of  memory,  and  the  value  is  greater  than  15%  of
the  total  available  memory  or  greater  than  two  times  the  value  reported  in 
column  4,  then  the  implication  is  that  memory  is  a  constraint  on  this 
system.  Finally,  if  Column  7  shows  a  shortfall  of  memory,  but  does  not 
exceed  the  aforementioned  limits,  the  implication  is  that  the  system  has 
reached  memory  saturation,  but  is  still  able  to  process  the  current  workload. 
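
The four outcomes of Step 1 can be sketched as a single classification. Two assumptions are made here and are not stated in the text: column 7 is passed as a signed value (positive for surplus, negative for shortfall), and "total available memory" is taken to be column 5. The function name and result strings are illustrative.

```python
def classify_memory_balance(col7_k, col5_available_k, col4_avg_activity_k):
    """Classify the column 7 surplus or shortfall per Step 1.

    col7_k is signed: positive = surplus, negative = shortfall (assumption).
    """
    magnitude = abs(col7_k)
    exceeds_limits = (magnitude > 0.15 * col5_available_k
                      or magnitude > 2 * col4_avg_activity_k)
    if col7_k >= 0:
        if exceeds_limits:
            return "no memory constraint"
        return "sufficient memory, approaching saturation"
    if exceeds_limits:
        return "memory is a constraint"
    return "saturated, still processing workload"
```

With 320K available and a 25K average activity size, a 60K surplus indicates no constraint, while a 55K shortfall indicates that memory is a constraint.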

Step  2  -  It  should  be  stressed  that  the  value  reported  in  column  7  is 
calculated  over  the  entire  measurement  period  and  therefore  could  be  biased 
by  periods  of  heavy  or  light  activity.  It  is  for  this  reason  that  the  user 
is  urged  to  run  the  monitor  during  those  periods  of  time  where  the  workload 
is  considered  to  be  heavy  and  constant.  In  order  to  determine  if  the  above 
type  of  biasing  is  occurring,  the  user  may  want  to  check  Plots  1-3.  If  it 
appears  that  there  is  a  mixing  of  light  processing  and  heavy  processing  the 
user  may  want  to  re-run  the  data  reduction  program,  using  the  time-frame 
option,  to  separate  the  heavy  processing  time. 

Step  3  -  Calculate  the  ratio  of  column  5  divided  by  column  4.  This  is 
an  indication  of  the  maximum  number  of  user  jobs  that  your  system  can 
support  at  any  one  time,  without  the  occurrence  of  significant  swapping.  If 
the  value  in  column  10  is  equal  to  or  exceeds  this  ratio,  then  the 
implication  is  that  the  system  has  reached  memory  saturation.  If  the  value 
in  Column  10  is  within  2  units  of  the  ratio,  then  the  system  probably  has 
sufficient  memory  but  is  approaching  saturation.  Finally,  if  the  value  in 
column  10  is  less  than  the  ratio  by  more  than  2  units,  the  current  system 
has  sufficient  memory. 

This  step  can  be  further  verified  by  checking  columns  14,  16  and  18  for 
indication  of  significant  swapping. 
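
The ratio test of Step 3 can be sketched as follows. The function name, the returned strings, and the sample values are hypothetical.

```python
def classify_job_capacity(col5_available_k, col4_avg_activity_k, col10_avg_jobs):
    """Compare column 10 with the ratio column 5 / column 4 (Step 3).

    Returns (ratio, status).
    """
    ratio = col5_available_k / col4_avg_activity_k
    if col10_avg_jobs >= ratio:
        status = "memory saturation reached"
    elif ratio - col10_avg_jobs <= 2:
        status = "sufficient memory, approaching saturation"
    else:
        status = "sufficient memory"
    return ratio, status
```

For example, 320K of user memory and a 25K average activity size give a ratio of 12.8, so an average of 9 activities in memory indicates sufficient memory, while an average of 13 indicates saturation.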

Step  4  -  If  column  13  is  less  than  83,  the  current  system  should  have
sufficient  memory  and  the  other  steps  should  not  be  showing  indications  of
memory  problems.  If  column  13  is  between  83  and  95,  then  the  current  system
is  approaching  saturation  and  at  times  may  be  showing  some  indications  of  a
backlog.  If  the  figure  exceeds  95,  then  other  steps  should  be  indicating
signs  of  moderate  to  severe  memory  problems.

It  is  possible  for  column  13  to  indicate  that  sufficient  memory  is
available,  while  other  reports  tend  to  indicate  a  memory  contention
problem.  This  anomaly  results  from  the  presence  of  zero  urgency  jobs.  The
MUM  Data  Reduction  Program  discounts  zero  urgency  jobs  that  are  in  memory.
Such  jobs  are  treated  as  if  they  did  not  exist.  The  reason  for  this  is  that 
GCOS  is  supposed  to  swap  such  jobs  out  of  memory  if  their  memory  can  be  used 
by  waiting  jobs  (see  memory  algorithm  description  in  subsection  6.3.15).  In 
reality,  however,  this  does  not  always  appear  to  happen.  Therefore,  the 
following  condition  may  arise  during  data  reduction.  Assume  that  a  site  has 
100K  of  memory  configured  and  that  this  memory  is  currently  being  occupied
by  2  jobs  —  each  of  50K.  However,  one  of  those  two  jobs  has  a  zero 
urgency.  The  Data  Reduction  Program  will  claim  that  memory  is  only  50 
percent  utilized.  However,  at  the  same  time,  it  is  possible  that  several 
jobs  may  be  waiting  for  memory,  because  GCOS  is  unable  or  unwilling  to  swap 
the  zero  urgency  job  from  memory.  The  existence  of  this  problem  is 
highlighted  with  the  Zero  Urgency  Job  Report  (subsection  6.3.16).  In 
addition,  there  is  a  data  reduction  input  option  (ZERO)  which  tells  the  Data 
Reduction  Program  to  include  zero  urgency  jobs  in  all  calculations. 
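
The effect of discounting zero urgency jobs can be illustrated with the 100K example above. This sketch only mimics the arithmetic; it is not the Data Reduction Program's code, and the function name, the job tuple layout, and the urgency of 20 for the first job are assumptions.

```python
def memory_utilization_percent(jobs, configured_k, include_zero_urgency=False):
    """Percent of configured memory counted as in use.

    jobs: list of (size_k, urgency) for jobs currently in memory.
    Zero urgency jobs are ignored unless include_zero_urgency is True,
    which corresponds in spirit to the ZERO input option.
    """
    used_k = sum(size_k for size_k, urgency in jobs
                 if include_zero_urgency or urgency > 0)
    return 100.0 * used_k / configured_k


# Two 50K jobs in 100K of memory; the second has zero urgency.
jobs_in_memory = [(50, 20), (50, 0)]
default_view = memory_utilization_percent(jobs_in_memory, 100)
zero_option_view = memory_utilization_percent(
    jobs_in_memory, 100, include_zero_urgency=True)
```

The default view reports 50 percent utilization even though memory is physically full; including zero urgency jobs reports 100 percent.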

Step  5  -  If  column  8  is  greater  than  or  equal  to  3,  the  indication  is 
that  memory  has  become  a  constraining  factor. 

Step  6  -  If  Column  12  is  greater  than  2,  the  indication  is  that  memory 
wait  time  is  high  and  that  memory  is  probably  a  constraining  factor. 
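
The thresholds in Steps 4 through 6 can be collected into one check. This is a sketch; the function name and message wording are hypothetical, and the column values are passed in the units used on the Memory Statistics chart.

```python
def memory_threshold_findings(col13, col8, col12):
    """Return a list of findings for Steps 4, 5 and 6."""
    findings = []
    if col13 > 95:
        findings.append("column 13 exceeds 95: moderate to severe memory problems")
    elif col13 >= 83:
        findings.append("column 13 between 83 and 95: approaching saturation")
    if col8 >= 3:
        findings.append("column 8 is 3 or more: memory is a constraining factor")
    if col12 > 2:
        findings.append("column 12 exceeds 2: memory wait time is high")
    return findings
```

An empty list means that none of the three steps indicates a memory problem. A clean result should still be weighed against the zero urgency anomaly described under Step 4.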

At  this  point,  the  user  should  have  a  fairly  good  indication  as  to  whether 
or  not  memory  is  a  constraining  factor.  The  following  steps  will  indicate 
some  additional  reports  that  the  user  should  reference  to  determine  those 
jobs  that  might  be  causing  the  memory  problem. 

Step  7  -  Jobs  that  abort  and  then  must  be  rerun  are  among  the  largest
users  of  resources.  Aborts  usually  occur  due  to  user  errors,  but  hardware
aborts  are  not  uncommon.  If  management  is  aware  of  aborting  jobs  and  the
reasons  for  them,  they  can  possibly  save  substantial  system  resources.  The
Abort  Report  is  described  in  subsection  6.3.6  and  gives  an  indication  of  the 
system  resources  being  wasted  by  aborting  jobs. 

Step  8  -  It  is  important  for  management  to  be  aware  of  jobs  that  are
either  misusing  system  resources  or  are  requesting  large  amounts  of  system
resources.  Once  identified,  these  jobs  could  be  redesigned  or
scheduled  for  non-peak  processing  or,  in  the  case  of  wasted  resources,  the
waste  could  be  eliminated.  The  Excessive  Resource  Report  allows  the  user  to
uncover  jobs  of  this  type  and  is  described  in  subsection  6.3.10.  When  using
this  report,  the  following  are  suggested  parameter  values:

wasted  core  -  10K
memory  -  either  35K  (WWMCCS  standard)  or  2  times  the  value  in  column  4
CPU  time  -  15  minutes
I/O  time  -  30  minutes
URC  -  40
RATIO  -  2
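
The suggested values can be held in a small table so that the memory limit follows the site's own column 4 figure when one is available. This is a sketch only; the dictionary keys are hypothetical labels and are not the keywords accepted by the data reduction program.

```python
def suggested_excessive_resource_parameters(col4_avg_activity_k=None):
    """Return the suggested Excessive Resource Report values.

    The memory limit is 35K unless a column 4 value is supplied,
    in which case it is two times that value.
    """
    if col4_avg_activity_k is None:
        memory_k = 35
    else:
        memory_k = 2 * col4_avg_activity_k
    return {
        "wasted_core_k": 10,
        "memory_k": memory_k,
        "cpu_time_minutes": 15,
        "io_time_minutes": 30,
        "urc": 40,
        "ratio": 2,
    }
```

For a site whose average activity size (column 4) is 25K, the memory limit becomes 50K.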

Step  9  -  By  examining  the  System  Program  Usage  of  Memory  Report,  the 
user  can  determine  those  system  type  jobs  that  are  requiring  memory.  It  is 
possible  that  some  of  the  system  jobs  can  be  eliminated  or  at  least  reduced 
in  size.  This  is  especially  true  for  the  TSS.  However,  it  must  be  realized 
that  a  limitation  on  Time  Sharing  size  may  adversely  affect  TSS  response.  In
many  cases,  if  large  file  transfers  are  being  processed  during  prime  time,
the  size  of  the  FTS  WIN  subsystem  can  rise  to  70  or  80K.  By  not  allowing
WIN  file  transfers  to  run  during  prime  time,  significant  memory  savings  can 
result. 

Step  10  -  As  is  explained  in  great  detail  in  subsection  6.3.15,  it  is 
vitally  important  that  the  overall  urgency  level  of  jobs  being  processed 
remain  low.  The  Distribution  of  Urgency  Report  can  be  used  to  determine  the 
overall  urgency  level  of  jobs  being  processed.  This  report  should  show  that 
60  percent  of  the  jobs  being  processed  at  any  one  time  have  an  urgency  level 
below  40  and  that  a  substantial  proportion  of  these  should  have  an  urgency 
level  between  5-10.  The  summary  at  the  bottom  should  indicate  that  75-80 
percent  of  all  activities  processed  had  an  urgency  level  below  20. 

If  this  report  indicates  a  large  percentage  of  high-urgency  jobs,  then  the 
SNUMB/IDENT  report,  or  the  Excessive  Resource  Report,  can  be  used  to 
identify  those  particular  activities  processing  with  a  high  urgency. 
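
The urgency guidelines of Step 10 can be checked against a list of urgency values. This is a sketch under assumptions: the percentages in the text are treated as minimums (60 percent below urgency 40 for jobs in process, 75 percent below urgency 20 for all activities processed), and the input lists are hypothetical rather than taken from the Distribution of Urgency Report.

```python
def percent_below(urgencies, limit):
    """Percent of urgency values that are strictly below the limit."""
    if not urgencies:
        raise ValueError("at least one urgency value is required")
    return 100.0 * sum(1 for u in urgencies if u < limit) / len(urgencies)


def urgency_profile_ok(in_process_urgencies, processed_urgencies):
    """True when both Step 10 guidelines are met (as interpreted above)."""
    return (percent_below(in_process_urgencies, 40) >= 60
            and percent_below(processed_urgencies, 20) >= 75)


# Hypothetical samples.
in_process = [5, 8, 10, 15, 45, 50, 7, 30, 60, 9]   # 70 percent below 40
processed = [5] * 8 + [30, 50]                      # 80 percent below 20
```

If the check fails, the reports named above can be used to find the activities running at high urgency.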

Step  11  -  If  the  analyst  wants  to  track  the  memory  performance  of  a 
given  set  of  jobs,  the  use  of  the  SPECL  input  option  and  the  generation  of 
the  Special  Job  Memory  Reports  will  provide  sufficient  data  for  detailed 
memory  tracking.  This  procedure  is  especially  useful  in  analyzing  the 
memory  requirements  of  TS1,  FTS  and  the  special  JDA-developed  software 
(JDSXP,  JDSUP).  Refer  to  subsections  6.1.24  and  6.3.14  for  complete
descriptions  of  these  Special  Job  Memory  Reports. 

Step  12  -  Another  indication  of  poor  system  performance  possibly  caused 
by  memory  shortfall,  tape  drive  shortfall,  poor  operator  performance  or  a 
poor  system  scheduler  design  is  the  long  delay  of  jobs  as  they  pass  through 
the  various  allocation  phases  prior  to  core  allocation.  The  Allocation 
Status  Report,  the  System  Scheduler  Delay  Time  Histogram  and  the  Delay  Time 
Until  Core  Allocation  Histogram  can  all  be  used  to  determine  which  jobs,  and 
how  many  jobs,  are  being  significantly  delayed  during  the  various  allocation 
phases.  These  reports  are  all  fully  discussed  in  section  6. 


Step  13  -  The  four  time  plots  should  be  examined  in  order  to  determine
if  memory  problems  occur  during  specified  times  of  day.  If  this  appears  to 
be  the  case,  then  an  adjustment  of  the  scheduler  classes,  or  manual  control 
of  the  scheduling  of  jobs,  may  alleviate  the  problem. 

Step  14  -  In  order  to  perform  a  valid  operation,  measurements  should  be 
made  over  several  days.  Figure  14-6  is  a  summary  check  sheet  that  can  be 
used  for  this  evaluation. 

Step  15  -  Memory  problems  may  also  be  occurring  as  a  result  of  jobs 
being  delayed  due  to  CPU  constraints  or  I/O  constraints.  In  these  cases, 
jobs  tend  to  sit  in  memory  due  to  a  lack  of  other  system  resources.  Because 
these  jobs  are  being  delayed,  other  jobs  cannot  enter  memory,  and  memory 
demands  begin  to  backlog.  Therefore,  if  memory  is  a  constraint,  the  user 
should  consider  conducting  a  CPU  analysis  as  well  as  an  I/O  analysis.

14.6.4  CPU  Evaluation.  The  CPU  evaluation  will  determine  the  general
utilization  level  of  the  processor  and  then  determine  if  the  CPU  is
dominated  by  GCOS  or  user  program  execution.  In  addition,  the  CPU
evaluation  can  be  used  to  determine  if  jobs  are  being  significantly  delayed
by  a  lack  of  processor  power.  A  CPU  data  reduction  is  required  for  this
evaluation.  It  is  also  beneficial  to  have  an  associated  MUM  data  reduction 
available  for  the  same  time  period. 

14.6.4.1  Data  Recording.  The  heading  page  of  the  CPU  data  reduction  report 
provides  the  dispatcher  options  currently  in  effect  on  the  system.  Recent
tests  have  shown  that  the  Urgency  Thruput  Option  should  be  enabled,  as  well
as  the  In-Core  Push  Area  and  Dynamic  Buffering  of  SSA  Modules.  The  TSS  and
the  various  WIN  subsystems  should  not  be  placed  in  Priority  B  processing. 

In  addition,  sites  should  try  to  avoid  enabling  the  I/O  Thruput  Option, 
unless  strong  justification  exists  to  decide  otherwise.  The  CPU  Time  Report 
is  produced  every  10  minutes  of  elapsed  time  and  the  data  of  interest  should 
be  found  in  the  last  10-minute  report. 

14.6.4.2  Evaluating  the  Data. 

Step  1  -  The  CPU  may  be  considered  a  bottleneck  if  the  %  Idle  CPU  column
of  the  CPU  Time  Report  is  less  than  20  or  the  summary  column  at  the  end  of
the  CPU  Idle  plot  indicates  that  50%  of  the  time  or  more,  the  processor  is
less  than  25%  idle.  It  is  a  generally  accepted  standard  that  average
processor  utilization  should  not  exceed  80%.  By  maintaining  average
processor  utilization  at  this  level,  the  processor  will  have  sufficient 
remaining  capacity  to  be  able  to  handle  those  peaks  of  processor  demand  that 
normally  occur  during  the  day.  The  three  category  breakdowns  at  the  end  of 
the  CPU  plot  represent  the  three  conditions  of  insufficient  power, 
sufficient  power,  and  excess  power  respectively. 
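
The two bottleneck tests of Step 1 can be sketched as shown below. The function name is hypothetical, and the list of periodic idle percentages stands in for the summary at the end of the CPU Idle plot, which is an assumption about how that summary would be reproduced from raw samples.

```python
def cpu_is_bottleneck(idle_percent, idle_samples):
    """Apply the Step 1 tests.

    idle_percent: the % Idle CPU value from the CPU Time Report.
    idle_samples: idle percentages for successive periods (assumed input).
    """
    if not idle_samples:
        raise ValueError("at least one idle sample is required")
    low_idle = sum(1 for sample in idle_samples if sample < 25)
    share_low_idle = 100.0 * low_idle / len(idle_samples)
    return idle_percent < 20 or share_low_idle >= 50
```

For example, an overall idle figure of 30 percent does not by itself indicate a bottleneck, but it does if half or more of the sampled periods show less than 25 percent idle.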

Figure  14-6.  Memory  Evaluation  Check  Sheet  (Part  1  of  2)

MEMORY  EVALUATION

_1.  Fill  out  memory  statistics  report  form

_2.  Review  urgency  distribution  report  (most  user  jobs  should  be  in  the
     0-20  range)

_3.  Evaluation  of  memory  statistics:

   _a.  Calculate  15%  of  user  memory  available
           #7  *  .15  =  _  (1)
        Calculate  two  times  average  activity  size
           #6  *  2  =  _  (2)
        List  the  memory  surplus  or  shortfall  from  #9
           Surplus  _  (3)
           Shortfall  _  (4)
        If  (3)  >  (1)  or  (2)  ->  OK
           (3)  <  (1)  or  (2)  ->  approaching  saturation
           (4)  >  (1)  or  (2)  ->  constraint

   _b.  Refer  to  CMP  14-17,  step  2,  to  look  for  bias  conditions,
        which  may  affect  the  results  obtained  in  (a)

   _c.  Calculate  the  ratio  of  user  memory  available  to  average
        activity  size
           #7/#6  =  _  (5)
        List  the  average  number  of  user  activities  in  memory
           #12  =  _  (6)
        If  (6)  +  2  <  (5)  ->  OK
           (6)  <  (5)  ->  approaching  saturation
           (6)  >  (5)  ->  saturation
        Verify  by  looking  at  swap  values  in  #16,  #19,  #22

   _d.  Additional  checks  for  memory  problem:
           #15  >  85
           #10  >  3
           #14  >  2

_4.  Examine  PALC  Report,  Excessive  Report  for  consistent  offenders

Figure  14-6.  (Part  2  of  2)

Step  2  -  The  %  Gate  Loop  column  of  the  CPU  Time  Report  provides  an
indication  of  the  percentage  of  CPU  power  being  lost  because  multiple
processors  are  interfering  with  each  other.  Within  GCOS,  there  are  many
tables  that  must  be  updated  and/or  referenced  by  a  processor  during  its 
execution.  When  these  tables  are  being  used,  the  processor  must  ensure  that
they  are  not  altered  in  any  manner.  In  order  to  accomplish  this,  the 
processor  will  lock  a  gate.  This  locked  gate  will  prevent  any  other 
processor  from  accessing  this  table.  When  the  first  processor  has  completed 
its  use  of  the  table,  the  gate  will  be  opened.  If  a  processor  must  access  a 
gated  table,  it  will  simply  perform  a  "CPU  loop"  at  that  table,  waiting  for
it  to  be  opened.  The  amount  of  time  being  spent  in  this  gate-locked  CPU
loop  code  is  depicted  by  this  column.  If  this  value  is  greater  than  5%, 
then  this  is  an  indication  that  the  multiple  processor  configuration  is 
beginning  to  lose  its  cost  effectiveness. 

Step  3  -  The  MUM  Excessive  Resource  Report  can  be  used  to  determine 
those  jobs  requiring  excessive  CPU  resources.  In  addition,  the  CPU  Plot 
report  can  be  used  to  determine  those  times  of  day  when  processor 
availability  is  in  the  critical  range.  Using  these  two  items  of 
information,  system  scheduler  classes  can  be  created  based  on  CPU 
requirements. 

Step  4  -  Another  indication  that  the  CPU  is  a  bottleneck  can  be 
determined  from  the  CPU  Queue  Length  histogram.  If  the  average  queue  length 
is  greater  than  two  times  the  number  of  processors  configured,  then  the 
processors  are  being  requested  to  handle  an  excessive  amount  of  work.  This 
queuing  of  jobs  at  the  CPU  is  an  indication  that  during  some  period  of  the 
day,  there  is  insufficient  CPU  power  to  handle  the  workload.  Once  again, 
the  CPU  Plot  can  be  used  to  determine  those  periods  of  time. 

In  addition,  at  the  bottom  of  each  10-minute  section  of  the  CPU  Time  Report, 
a  line  is  printed  indicating  the  CPU  queue  length  during  the  last  10-minute 
period,  as  well  as  since  the  start  of  the  run.  These  figures  provide  an 
excellent  indication  of  those  times  during  the  day  that  the  processor  is 
being  overloaded.  Control  of  the  scheduler  queues  is  one  method  of  limiting 
the  amount  of  work  being  entered  into  the  system. 

Step 5 - The CPU Time Report and WIN Report can be used to determine
those  periods  of  time  when  TSS  and/or  WIN  programs  are  using  excessive 
amounts  of  processor  time  or,  on  the  other  hand,  do  not  appear  to  be 
requesting  sufficient  CPU  service. 

Step  6  -  If  the  Percent  of  Memory  Time  in  Queue  histogram  shows  that  the 
average activity is spending more than 30% of its memory time waiting for
the  processor,  this  is  a  strong  indication  that  processing  power  is  a 
constraining  factor.  Once  again,  in  order  to  relieve  this  constraint,  it 
will  be  necessary  to  acquire  an  additional  processor,  a  faster  processor,  or 
else  the  workload  will  need  to  be  controlled  via  the  scheduler  classes.  The 
CPU Access By SNUMB Report can be used to determine, on an
activity-by-activity  basis,  exactly  which  activities  are  being  delayed  the 
most  due  to  the  lack  of  processor  power. 


Step 7 - If the CPU Time Report shows that the percent of system CPU
exceeds 30%, this is another indication that the system software is being
requested  to  handle  an  excessive  workload.  In  all  probability,  the  system 
queues  are  increasing  to  such  a  size  that  the  system  software  is  expending 
excessive  resources  managing  its  queues. 

Step  8  -  The  CPU  Time  Report  (dispatcher  queuing  portion)  indicates  the 
percentage  of  time  the  various  system  programs  spend  waiting  for  the 
processor  and  the  average  queue  position  of  these  programs.  System  programs 
should not spend more than 2% of the time waiting for service and their
average queue position should not exceed 2.  If this is not the case, the
analyst should ensure that the Urgency Thruput Option is enabled.  In
addition, the Urgency Report and Excessive Resource Report of the Memory
Monitor should be examined to ensure that there is not an excessive number
of user activities processing at very high urgencies (see section 14.6.3 for
memory evaluation details).

Step  9  -  Figure  14-7  is  a  CPU  Evaluation  check  sheet  that  can  be  used 
for  this  evaluation. 
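
The numeric thresholds quoted in the steps above can be gathered into one screening routine.  The Python sketch below is illustrative only: the `CpuStats` field names and the `cpu_findings` function are hypothetical and are not part of GMF, and the values are assumed to be transcribed by hand from the reports.

```python
from dataclasses import dataclass


@dataclass
class CpuStats:
    """Values copied by hand from the reports (field names are hypothetical)."""
    processors: int                  # number of processors configured
    gate_loop_pct: float             # % of time in gate-locked CPU loop code
    avg_queue_length: float          # CPU Queue Length histogram average (step 4)
    memory_time_in_queue_pct: float  # Percent of Memory Time in Queue (step 6)
    system_cpu_pct: float            # percent of system CPU (step 7)
    system_wait_pct: float           # % of time system programs wait (step 8)
    system_queue_position: float     # average queue position (step 8)


def cpu_findings(s: CpuStats) -> list:
    """Return the names of the thresholds that the supplied values exceed."""
    findings = []
    if s.gate_loop_pct > 5:
        findings.append("gate lock contention")
    if s.avg_queue_length > 2 * s.processors:
        findings.append("CPU queue length")
    if s.memory_time_in_queue_pct > 30:
        findings.append("memory time in queue")
    if s.system_cpu_pct > 30:
        findings.append("system CPU")
    if s.system_wait_pct > 2 or s.system_queue_position > 2:
        findings.append("system program queuing")
    return findings


# Made-up sample values for a two-processor system.
sample = CpuStats(2, 6.0, 5.0, 12.0, 18.0, 1.0, 1.5)
print(cpu_findings(sample))
```

An empty list means none of the quoted limits was exceeded; it does not replace the review of the individual reports.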

14.6.5  I/O  Evaluation.  The  I/O  evaluation  will  determine  whether  the  mass 
storage  subsystem,  or  tape  channel  subsystem  is  the  cause  of  system 
degradation.  This  evaluation  requires  the  user  to  have  processed  the  Mass 
Storage  Monitor  and  Channel  Monitor  data  reduction  programs.  See  section 
8.6.13  for  a  discussion  of  how  to  limit  the  processing  of  the  CM  data. 

14.6.5.1  Data  Recording.  All  output  from  the  Mass  Store  Monitor  and 
Channel  Monitor  are  required.  No  individual  work  tables  are  required,  but 
the  user  may  generate  some  if  he  feels  that  it  will  help  in  his  analysis 
(see  figure  14-8). 

14.6.5.2  Evaluating  the  Data.  Chapters  7  and  8  provide  a  fairly  detailed 
description  of  the  procedure  to  be  followed  in  analyzing  the  associated 
reports.  In  this  section,  reference  will  be  made  to  those  chapters 
indicating actual data values that should be used as a reference for
comparison. 

Step  1  -  Read  subsection  7.3  and  subsections  8.2  and  8.3. 

Step  2  -  Check  the  crossbar  configuration  using  the  procedure  described 
in  subsection  8.2  (see  figure  14-9  for  sample  check  sheet). 

Step  3  -  Examine  the  Proportionate  Device  Utilization  Report  produced  by 
either  the  MSM  or  CM.  Check  for  devices  which  have  significantly  higher 
utilization  than  other  devices  in  the  system.  These  devices  are  potential 
bottlenecks  and  should  be  more  closely  analyzed.  It  is  desirable,  even 
though  perhaps  not  possible,  to  have  equal  utilization  across  all  disk 
packs.  Read  subsections  7.3.18  and  7.3.20  for  further  details  on  this 
step.  Once  a  pack(s)  is  identified,  further  analysis  should  be  performed  to 
determine the actual files being referenced on the pack (see step 9).
CPU EVALUATION


__1.  Fill out CPU statistics report form.

2.  Review CPU time report.  The percent of time in queue and average
queue position of system jobs should be less than 2.

3.  Evaluation  of  CPU  statistics: 

_ a.  If  #5  20  or  #6  50  -  CPU  bottleneck 

_ b.  If #9 > 5% => multiprocessor configuration is losing cost
effectiveness

_ c.  Calculate number of processors * 2 = _ (1)

If #10 > (1) => insufficient CPU power to handle workload

d.  If #11 or #12 > 30 => insufficient CPU power to handle workload


4.  Dispatcher Options:

_ a.  Urgency Thruput
_b.  I/O Thruput
_c.  MCOUNT On
d.  Priority B On
e.  In-Core Push Area
_ f.  Dynamic Buffering of SSA Modules

5.  Time  of  day  of  heaviest  CPU  usage. 

6.  Five-ten  jobs  most  affected  by  CPU  queue  wait  —  look  for  trends 
(IDEST/USERID  from  MUM) 


Figure  14-7.  CPU  Evaluation  Check  Sheet  (Part  1  of  2) 
CPU STATISTICS REPORT

Figure 14-7.  (Part 2 of 2)

I/O EVALUATION


_1.  Run FMS SPUTIL Report and FRAG Report.  Refer to GMF Manual 14-23,
and fill in the disk space utilization form.

_2.  Review the Channel Monitor Histograms for I/O queue lengths and I/O
queue times.  Note any devices which had average queue lengths > 1
or average queue times > 15 ms.  These should be investigated for
device contention and proportionate device utilization.

_3.  Review  the  Channel  Monitor  System  Summary  Report. 

_ a.  Determine the distribution of connects across device types:

_ b.  Determine the percent of connects issued at the third level
for each multiple device channel:

                 1     2     3     4     5
IOM #
CHAN #
3rd LEVEL %



_4.  Fill in I/O Statistics Report form from MSM and CM reports.

5.  Evaluation: 

_ a.  If #5 < 55%, should try to increase FMS Cache.

_ b.  If #9 > 25K, then large subsystem user may be degrading
TSS response.  Review CAM to verify.

_ c.  Review #6, #7, #8 to determine if AST Buffers are adequate:

_ d.  If #10 < 90, SSA Cache needs to be increased.

_ e.  If #11 > 7%, examine system modules which need to be moved to
SSA Cache.

_6.  Review  the  Mass  Store  Monitor  Seek  Movement  Histograms  for  potential 
problem  devices 

_7.  Review  the  Mass  Store  Monitor  Head  Movement  Efficiency  Report. 
Efficiency  of  1.5  indicates  potential  problem. 

_8.  Explain  how  to  get  SSA  Cache  hit  ratio  and  FMS  Cache  hit  ratio  using 
PEEK  command  from  console. 

Figure  14-8.  I/O  Evaluation  Check  Sheet  (Part  1  of  2) 
I/O STATISTICS REPORT


#1      #2            #3           #4       #5
DATE    START TIME    STOP TIME    HOURS    FMS CACHE HIT RATIO

#6 

#7 

#8 

#9 

#10 

NIAST 

NIAST 

TSS  SWAP 

NIAST 

BUFF 

DELAY 

TRANSFER 

SSA  MODULE 

EFF 

SUFF 

RATIO 

SIZE (80%)

BUFFER  HITS 

#11             #12               #13
SYSTEM I/O %    % WRITE VERIFY    % CONNECTS ACROSS
                                  181  191  451  500  501

#14                #15
5 MOST HEAVILY     HEAD MOVEMENT EFF
USED DEVICE IDS    181  191  451  500  501


Figure  14-8.  (Part  2  of  2) 
MPC/PSIA CROSSBAR CHECK


1.  List the IOM channels on the crossbar card

2.  From the MPC card, list MPC/PSIA for each channel

3.  Problem areas:

a.  Same MPC/PSIA appears consecutively (logical channel problem)

_ b.  Same MPC number and PSIA 0,1 and 2,3 appear consecutively
(link adapter conflict)


IOM/CHANNEL

MPC/PSIA


Figure 14-9.  Crossbar Check Sheet

Step 4 - The histogram displaying Data Transfer Sizes for TSS Swap Files
can  give  a  strong  indication  of  the  sizes  of  TSS  subsystems  being  used.  TSS 
subsystems  of  over  25K  can  cause  significant  increase  in  overall  TSS 
response,  especially  if  several  of  these  subsystems  are  being  executed 
simultaneously.  If  more  than  20  percent  of  the  entries  in  this  report  fall 
in  the  bucket  ranges  above  25000,  this  is  a  strong  indication  that  TSS 
response  might  be  a  problem.  This  problem  can  be  further  confirmed  with  the 
CAM. 
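
The 20 percent test in step 4 is a simple proportion.  As a minimal sketch, assuming the histogram has been keyed by hand into a mapping from each bucket's lower bound to its entry count (the function name and the sample figures are hypothetical):

```python
def share_above(histogram, limit=25000):
    """Fraction of histogram entries in buckets whose lower bound is >= limit.

    `histogram` maps a bucket's lower bound to the number of entries
    recorded in that bucket.  Treating "above 25000" as "lower bound of
    25000 or more" is an assumption about how the buckets are laid out.
    """
    total = sum(histogram.values())
    if total == 0:
        return 0.0
    large = sum(count for lower, count in histogram.items() if lower >= limit)
    return large / total


# Made-up sample: 100 swap transfers, 20 of them in the large buckets.
swap_sizes = {0: 50, 10000: 30, 25000: 15, 40000: 5}
share = share_above(swap_sizes)
print(f"{share:.0%} of entries above 25000; investigate: {share > 0.20}")
```

The sample sits exactly on the 20 percent boundary, so it is not flagged; the text calls for more than 20 percent.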


Step  5  -  Seek  Elongation  -  Subsections  7.5.6,  7.5.7  and  7.5.8  describe 
in  detail  the  procedures  used  to  investigate  seek  elongation  problems.  An 
average  seek  of  over  50  cylinders  for  DSS191s  and  100  cylinders  for  DSU450s 
and  DSU500s  should  be  considered  significant.  If  seek  elongation  problems 
are  discovered,  further  analysis  should  be  performed  to  determine  the  actual 
files  being  referenced  on  the  pack  (see  step  9). 

Step  6  -  Analyze  the  Channel  Monitor  Idle  Report.  This  report  can  be 
generated  only  if  the  Idle  Monitor  was  run  in  conjunction  with  the  Channel 
Monitor.  If the "% of Idle Time During Which I/O Was Active" value exceeds
25%, then substantial benefit may be obtained by eliminating I/O
contention.  The  above  value  is  an  indication  that  even  though  the  CPU  is 
going  idle  (i.e.,  has  no  useful  work  to  perform)  there  really  is  potential 
CPU  work  available.  However,  under  current  conditions,  this  potential  CPU 
work  is  being  delayed  because  of  I/O  contention. 

Even though the above figure exceeds 25%, the system may not have sufficient
CPU power available to handle the increased work generated by removing the
I/O contention.  Therefore, the analyst should also check that the "Average
System % Idle" figure exceeds 15%.  If this proves to be the case, then
removal of any I/O contention should prove beneficial.  On the other hand,
if the figure is lower than 15%, then removal of any I/O contention will
probably  result  in  additional  CPU  contention.  The  Idle  Report  will  also 
indicate  those  devices  causing  most  of  the  contention.  Make  a  record  of  the 
device  numbers. 
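
The two-part test in step 6 can be written as a short decision function.  This is a sketch only; the function name and the wording of the returned advice are hypothetical, and a value of exactly 15% idle (which the text does not address) is treated here as not exceeding the limit.

```python
def io_contention_advice(idle_with_io_active_pct, average_system_idle_pct):
    """Apply the 25% and 15% tests from the Channel Monitor Idle Report."""
    if idle_with_io_active_pct <= 25:
        return "no substantial benefit indicated"
    if average_system_idle_pct > 15:
        return "removing I/O contention should prove beneficial"
    return "removing I/O contention will probably add CPU contention"


print(io_contention_advice(32.0, 22.0))
print(io_contention_advice(32.0, 9.0))
```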

Step  7  -  Examine  Channel  Monitor  reports  for  I/O  queue  time  problems. 

In  performing  this  step,  the  following  reports  should  be  used: 

a.  Channel  Busy  and  Device  Busy  Report.  This  report  would  indicate 
that  both  a  channel  shortage  and  a  device  contention  problem  exist.  In 
order  to  alleviate  the  problem,  files  would  not  only  need  to  be  removed 
from a given device, but also moved to a new device located under a
different  channel  subsystem.  If  additional  channel  power  was  acquired, 
then  files  could  be  removed  from  the  device  exhibiting  contention  and 
moved  to  another  device  under  the  same  channel  subsystem.  An  entry 
should  be  considered  a  candidate  for  further  analysis  if  more  than  20 
percent  of  all  its  accesses  are  queued. 
b.  Channel Busy and Device Free Report.  This report would indicate
that  accesses  to  a  given  pack  are  being  delayed  because  of  a  lack  of 
channel  power.  The  solution  to  this  problem  is  to  increase  the  channel 
capacity  of  the  system,  or  else  move  a  significant  number  of  files  to 
devices  located  under  a  different  channel  subsystem.  Any  device  which 
has  over  20  percent  of  its  accesses  being  delayed  because  of  channel 
contention  should  be  considered  a  candidate  for  further  analysis. 

c.  Channel  Free  and  Device  Busy  Report.  This  report  would  indicate 
that  a  given  device  is  a  potential  bottleneck  if  over  20  percent  of  its 
accesses  are  being  queued.  The  solution  to  this  problem  is  a  relocation 
of  files  off  of  the  device  in  question. 

d.  Device Free But Has a Queue Report.  This report shows the number of
times a connect was made to a device when the device was free (no active
connect was in process), yet there were outstanding requests waiting
for the device.  This report is another indication of a channel
contention problem (similar to Channel Busy and Device Free Report).
The analyst should refer to subsection 8.5.6.9 to see the differences
between the two reports.  An entry should be considered significant if
over 20 percent of its accesses are delayed in such a manner.

e.  500 Disk Drive Report.  The analyst should refer to subsections
7.5.7 and 8.5.6.10 for an explanation of the physical characteristics of
a 500 device and the information to be obtained from this report.
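
The 20 percent screening applied to the first four reports in step 7 can be sketched as follows.  The row layout (device, report name, delayed accesses, total accesses), the function name, and the abbreviated remedy wording are all assumptions made for illustration; the Channel Monitor does not produce data in this form.

```python
THRESHOLD = 0.20  # "more than 20 percent of its accesses"

# Remedies paraphrased from items a through d above.
REMEDY = {
    "Channel Busy and Device Busy": "move files to a device under a different channel subsystem",
    "Channel Busy and Device Free": "add channel capacity or move files to a different channel subsystem",
    "Channel Free and Device Busy": "relocate files off the device",
    "Device Free But Has a Queue": "investigate channel contention",
}


def flag_devices(rows):
    """Return (device, report, remedy) for every row over the threshold."""
    flagged = []
    for device, report, delayed, total in rows:
        if total > 0 and delayed / total > THRESHOLD:
            flagged.append((device, report, REMEDY[report]))
    return flagged


# Made-up sample rows transcribed from the reports.
rows = [
    ("0-8-6", "Channel Free and Device Busy", 310, 1000),
    ("0-8-2", "Channel Busy and Device Free", 150, 1000),
]
for entry in flag_devices(rows):
    print(entry)
```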

Step  8  -  If  certain  devices  have  been  determined  as  bottlenecks  under 
the  procedures  described  in  step  7,  the  Job  Conflict  Report  should  be 
obtained  for  those  devices  following  the  procedures  described  in  subsection 
8.5.12. 


Step  9  -  Execute  the  Mass  Store  Monitor  Data  Reduction  Program. 

Following  the  procedures  described  below,  the  analyst  should  be  able  to 
determine  the  exact  files  that  are  causing  the  contention  found  under  the 
earlier  steps: 

a.  Review  the  System  File  Use  Summary  Report  and  the  Individual  Module 
Activity  Report  (subsections  7.5.9  and  7.5.10)  to  determine  whether 
system  files,  SYSOUT  files,  accounting  files,  or  TSS  swap  files  are 
located  on  the  device  in  question.  If  accesses  to  these  files  are  a 
significant  percentage  (20  percent)  of  all  accesses  to  the  device,  then 
relocation  of  these  system  files  may  alleviate  the  problem. 

b.  Using  the  procedures  described  in  subsections  7.5.12,  7.5.13,  7.5.16 
and  7.6.1,  determine  all  files  that  were  accessed  on  the  devices  in 
question. 
Step 10 - Using the CM, it is possible to perform a detailed analysis on
channel queuing for a particular job.  Details for this procedure can be
found in subsections 8.5.13, 8.5.14 and 8.6.10.

Step  11  -  This  step  outlines  procedures  for  relocating  files  identified 
as  candidates  for  file  relocation.  Because  of  automatic  load-leveling 
activity  by  the  GCOS  operating  system,  an  analyst  has  only  limited 
flexibility  for  the  placement  of  system,  permanent,  and  temporary  files: 

a.  System Files.  The device name on which a system file is to be
placed can be specified at system startup.  Care should be taken to
ensure that multiple high-use system files are not placed on the same
disk device.  If possible, separate high-use system files across disk
subsystems.  In  addition,  ensure  that  SSA  Cache  memory  and  FMS  cache  are 
enabled  to  reduce  disk  I/O  activity  to  certain  system  files.  Details 
for  this  analysis  can  be  found  in  subsections  7.5.9,  7.5.10  and  7.5.19. 

b.  Permanent  Files.  The  device  name  for  a  permanent  file  can  be 
specified  at  creation,  whether  through  FMS  or  the  ACCESS  subsystem  of 
Time  Sharing.  Files  can  be  moved  by  changing  their  names,  creating  a 
new  file  with  the  old  name,  and  moving  the  data.  The  new  file  can  be 
created  with  a  DEVICE  specification. 

c.  Temporary  Files.  The  device  name  for  a  temporary  file  can  be 
specified  in  the  second  field  of  the  $FILE  card  in  the  job  control 
deck.  Jobs  which  run  frequently  can  have  their  $FILE  cards  changed. 
Other  jobs  can  be  controlled  by  policies  governing  the  use  of  $FILE 
cards. 

Additionally,  sites  that  have  different  device  types  may  specify 
preferred  device  types  to  be  used  for  temporary  files.  This  procedure 
will  allow  activities  requiring  disk  storage  to  take  advantage  of  higher 
speed  devices. 

Step  12  -  This  step  identifies  possible  seek  contention  problems 
attributable  to  inadequate  temporary  file  space.  This  procedure  uses  the 
Disk  Fragmentation  Report  (FRAG)  available  at  most  sites.  If  such  a  report 
is  not  available,  contact  CCTC/C751.  It  is  necessary  to  analyze  temporary 
disk  capacity  on  all  disk  units  rather  than  just  the  units  identified  in 
previous  tuning  steps.  This  analysis  is  necessary  because  the  disk  units 
exhibiting  high  activity  due  to  temporary  file  use  often  have  more  available 
temporary  space.  The  increased  utilization  of  these  disk  units  may  be 
caused  by  inadequate  temporary  storage  on  other  disk  devices.  For  this 
analysis  a  form  as  shown  in  figure  14-10  may  prove  useful. 

a.  Report Values.  The FRAG Device Report contains the following
information on each disk device: (1) device identification, (2) overall
capacity, (3) available disk space, (4) disk space dedicated to
permanent storage, (5) number of disk fragments, (6) average fragment
size, (7) maximum fragment size, (8) percentage distribution of
fragments by size, and (9) total fragmented space.

Figure 14-10.  Temporary Storage Test Form

b.  Form Entry.  Enter the device identification for each disk device on
the Temporary Storage Test Form.  For each disk device enter: (1) the
LLINK capacity column; (2) the temporary available; (3) the available
percent; (4) the number of fragments in the Number of Fragments column;
(5) the maximum fragment size in the Max Size column; (6) the percentage
of llinks (1-12 and 13-120, respectively) in the Percent Fragments
column.

c.  Calculations.  Add the Percent Llinks 1-12 and the Percent Llinks
13-120 columns and place the sum in the Total Percent column.

d.  Decision.  Place a check mark in the "Ratio" column if the available
percent is less than 16 percent.  Place a check mark in the "Fragment"
column if (1) the number of fragments is 100, (2) the maximum size of a
fragment is 2000, and (3) the total percentage of fragment llinks less
than 121 LLINKS (10 LINKS or less) is 40.

Step 13 - If devices have been checked in Step 12, then the temporary
file space on these devices is constrained by either (1) insufficient file
space or (2) disk fragmentation.

If  a  large  number  of  devices  need  to  be  checked,  then  overall  temporary  disk 
space  availability  may  be  a  constraint  to  system  performance.  At  this  point 
a  site  should  either  institute  procedures  to  recover  permanent  disk  space 
(purging  of  unused  files)  or  re-evaluate  disk  capacity  relative  to  system 
workload. 

If  certain  disk  devices  have  been  checked  because  of  their  temporary  to 
total  ratio,  or  if  many  devices  have  been  checked  because  of  disk 
fragmentation,  then  the  following  procedures  to  balance  the  file  system 
should  be  instituted. 

a.  Fragmentation.  Temporary  disk  space  may  be  compacted  by  a  full 
restore  of  the  file  system  (cold  boot).  The  Disk  Fragmentation  Report, 
run  monthly,  could  help  determine  the  frequency  at  which  a  full  file 
system  restore  would  be  necessary. 

b.  Amount  of  Available  Space.  The  full  file  system  restore 
redistributes  permanent  files  to  equalize  the  amount  of  temporary 
storage  on  each  pack.  The  FMS  SPUTIL  Report  should  be  examined  after 
the  restore  to  verify  that  disk  units  have  been  balanced.  Temporary 
disk space may still not be evenly distributed if (1) users have
specified  specific  devices  for  permanent  files  or  (2)  unusually  large 
files  are  present  on  certain  disk  units.  Consider  such  factors  as 
intentional  file  placement  for  maximum  seek  activity  and  operational 
requirements  of  data  bases  before  moving  permanent  files. 
14.6.5.3  Mass Storage Operation.  The following article appeared in the
September 1981 WWMCCS H6000 World Magazine.  It presents an excellent
description  of  disk  operation  and  some  problem  areas  that  a  system  analyst 
should  be  aware  of. 

"We won't discuss the gyrations that GCOS performs in setting up an I/O
request; e.g., management of PAT pointers, PAT bodies, etc. (We recommend
the HIS GCOS Analysis course).  Our interest begins at the point when an I/O
request generated by some workload element is presented to the I/O
Supervisor (IOS) for service.  We don't care whether the I/O came via GFRC
or directly from a MME GEINOS.  We do care about the major logical/physical
events  from  this  point  to  I/O  termination  which  are  important  to 
understanding  system  performance. 

"As  an  example,  let's  assume  that  we  have  a  dual-channel  micro-programmable 
controller  (MPC)  cross-barred  to  its  disk  devices.  Dual-channel  means  that 
the  MPC  has  two  physical  data  paths  for  transferring  data  to  the  IOM. 
Cross-barred  to  the  device  means  that  there  are  two  physical  paths  from  the 
MPC  to  each  of  its  devices. 

"Let's  also  assume  that  the  IOM  is  cross-barred  at  two-to-one.  IOM 
cross-barring  refers  to  the  logical-physical  hardware  channel  relationship 
of  the  IOM.  This  relationship  is  defined  to  GCOS  in  the  STARTUP  deck  and 
maintained  in  the  Secondary  Configuration  Table  and  elsewhere.  For  the 
example,  assume  that  PUB  addresses  0-8  and  0-9  are  associated  with  one 
physical  MPC  channel,  and  that  0-10  and  0-11  are  associated  with  the  second 
MPC  channel  as  is  shown  below: 


IOM-0

8   9   10   11


Figure 1
"Defined in the STARTUP deck is the primary PUB address for the disk
subsystem and the order in which logical paths will be assigned by GCOS
during processing.  The card

$  IOM-0  PUB-8, MS0450, UNITS-6,

$  ETC  UNIT-1, ST1


defines PUB 0-8 as the primary PUB address for a MSS451 subsystem.  The card

$  XBAR  IOM-0, PUB-8, PUB-10, PUB-9, PUB-11

defines the order in which PUB assignments will be made.  This information
is maintained by GCOS in the I/O stream table.  All requests for a device
are made with its primary PUB address, e.g., an I/O request for device 6 on
this subsystem would request IOM-PUB-DEVICE = 0-8-6.

"Suppose  that  IOS  receives  a  request  for  I/O  to  0-8-6.  If  the  device  is 
currently  performing  an  I/O  (i.e.,  busy),  the  I/O  request  is  queued  for 
later  service.  If  the  device  is  free,  IOS  will  assign  a  logical  channel  to 
the  request  if  one  is  available.  It  will  attempt  to  assign  PUB  0-8.  If  0-8 
is  busy,  it  will  attempt  to  assign  0-10,  etc.,  in  the  order  prescribed  on  the 
$ XBAR card.  If no logical channel is available, the I/O request will be
queued  for  later  service  as  before. 
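
The assignment rule just described can be modelled in a few lines.  The sketch below is an illustration in Python, not IOS code: the function name, the use of sets for busy devices and busy PUBs, and the simple end-of-queue handling are all assumptions, and the front-of-queue linking of certain requests is ignored.

```python
from collections import deque

# Order taken from the $ XBAR card in the example.
XBAR_ORDER = [8, 10, 9, 11]


def assign_channel(device, busy_devices, busy_pubs, io_queue, order=XBAR_ORDER):
    """Return the logical channel (PUB) assigned, or None if the request is queued."""
    if device in busy_devices:
        io_queue.append(device)      # device busy: queue for later service
        return None
    for pub in order:                # try the PUBs in the $ XBAR order
        if pub not in busy_pubs:
            busy_pubs.add(pub)
            busy_devices.add(device)
            return pub
    io_queue.append(device)          # no logical channel free: queue
    return None


busy_devices, busy_pubs, io_queue = set(), {8}, deque()
print(assign_channel(6, busy_devices, busy_pubs, io_queue))  # PUB 8 is busy, so 10 is tried next
```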

"To  recap,  for  an  I/O  request  to  become  active  (in-service),  both  the 
requested  device  and  a  logical  channel  associated  with  the  MPC  which 
controls  the  device  must  be  free.  Otherwise  the  request  is  placed  in  the 
I/O  queue.  Normally  the  request  will  be  placed  at  the  end  of  the  queue; 
i.e.,  behind  earlier  I/O  requests.  However,  certain  I/O  requests  are  linked 
in  front  of  the  queue,  e.g.,  TSS  requests  for  access  to  its  swap  files  (#S, 
#T,  #U,  #V).  This  can  impact  system  performance  and  will  be  discussed  later. 

"Whether  or  not  our  request  for  I/O  to  0-8-6  becomes  active,  IOS  will  scan 
the  I/O  queue  to  determine  if  any  earlier  I/O  requests  can  now  be  activated 
(device and logical channel free).  When an I/O request can be activated,
IOS loads the mailbox in Hard Core Memory for the logical channel and
signals  the  IOM.  This  mailbox  contains  information  required  by  the  IOM  to 
process  the  I/O  independently  of  the  mainframe  processor.  The  IOM  loads  the 
mailbox  information  into  its  scratch  pad  memory  and  begins  processing  the 
physical  I/O. 

"Thus  far,  the  events  which  have  occurred  were  independent  of  the  type  of 
disk  device  involved.  The  three  phases  of  physical  delay  for  a  disk  device 
are  seek  time,  rotational  latency,  and  data  transfer.  The  IOM  issues  a  seek 
request  to  the  MPC  for  0-8-6,  but  now  performance  implications  arise  which 
depend  both  on  the  physical  characteristics  of  the  device,  and  on 
device-dependent  interactions  with  IOS  which  might  be  required. 
"First let's look at seek time, defined as the time required to move the
read/write heads from their current position to the cylinder requested by
the current I/O.  For a given device, seek time is proportional to the
number of cylinders traversed.  Table 1 gives seek times for H6000 disk
devices.  The table provides sufficient information to derive the average
seek time for a device from the average number of cylinders moved (available
from MSM).


Device            DSS181    DSS191    MSU450

Minimum           10 ms     10 ms     8 ms
Average           34 ms     30 ms     23 ms
Maximum           60 ms     55 ms     55 ms
Cylinders/pack    200       404       808

Seek Specifications
Table 1

"Now that 'seek' has been defined, we can discuss the motivation of
implementing  the  logical-physical  channel  architecture.  The  disk  subsystem 
can  have  as  many  seeks  and  data  transfers  ongoing  as  there  are  logical 
channels  assigned  to  the  subsystem.  The  number  of  simultaneous  data 
transfers  that  can  be  performed  is  limited  by  the  number  of  physical 
channels.  In  our  example,  we  could  see  two  devices  connected  to  the 
physical  channels  transferring  data  and  two  additional  devices  moving  the 
read/write  heads  to  the  requested  cylinder  (seeking).  Or,  we  could  have  one 
data  transfer  ongoing  with  three  seeks,  or  we  could  have  four  seeks  ongoing, 
etc.  So  the  purpose  of  IOM  cross-barring  is  to  increase  the  simultaneity  or 
overlapping  of  device  seeks. 

"When  the  seek  to  the  requested  cylinder  has  been  completed,  the  device  is 
ready  to  transmit  or  receive  data.  Here  again  a  device-dependent 
performance  consideration  comes  into  play.  For  a  DSS191  or  MSU450  disk 
subsystem, once IOS has loaded the channel mailboxes and signalled the IOM,
it performs no further service to the I/O request until an I/O termination
interrupt is received for that device.  The determination of seek completion
and assignment of the physical channel for the data transfer takes place
completely in the MPC.  Not so for the DSS181 subsystem.

"The  DSS181  does  not  provide  a  seek  complete  interrupt,  and  so  IOS  must 
check  for  a  seek  complete  status  each  time  it  gets  control  for  an  I/O 
request  as  described  above  or  for  servicing  an  I/O  termination  interrupt. 
This  introduces  another  delay  for  the  181s  from  the  time  seek  completes 
until the next time IOS is activated.  When IOS detects seek complete for a
181,  it  must  issue  another  command  to  connect  the  physical  channel  and  begin 
data  transfer. 

"If  both  of  our  physical  channels  are  busy  transferring  data,  there  is 
another  delay  until  a  physical  channel  is  free.  But  let's  suppose  the  seek 
has  completed,  a  physical  channel  is  available  for  connecting  to  the  device 
(whether  by  the  MPC  or  by  IOS)  and  we  are  ready  to  transfer  data.  Another 
delay  is  imposed  now  due  to  the  rotational  latency  of  the  device.  This  is 
the  time  from  physical  channel  assignment  until  the  requested  sector  rotates 
to  the  read/write  heads.  This  delay  is  a  function  of  the  rotational 
velocity  of  the  device.  The  minimum  latency  for  a  device  is  always  0.  (The 
data  is  at  the  heads  at  channel  connect  time.)  Maximum  latency  for  a  device 
is  the  time  required  for  1  revolution  of  the  disk,  and  average  latency  is 
generally  defined  as  the  time  required  for  1/2  revolution  (see  Table  2). 


Device              DSU181      DSU191      MSU451

Rotational Speed    2400 RPM    3600 RPM    3600 RPM
                    40 RPS      60 RPS      60 RPS
Avg. Latency        12.5 ms     8.3 ms      8.3 ms
Max. Latency        25 ms       16.7 ms     16.7 ms

Rotational Latency
Table 2
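
The latency definitions in the article (maximum is one revolution, average is half a revolution) reduce to simple arithmetic on the rotational speed.  A minimal Python sketch, with a hypothetical function name:

```python
def rotational_latency_ms(rpm):
    """Return (average, maximum) rotational latency in milliseconds."""
    revolution_ms = 60_000 / rpm     # time for one full revolution
    return revolution_ms / 2, revolution_ms


for rpm in (2400, 3600):
    average, maximum = rotational_latency_ms(rpm)
    print(f"{rpm} RPM: average {average:.1f} ms, maximum {maximum:.1f} ms")
```

For a 3600 RPM device this gives about 8.3 ms average and 16.7 ms maximum; a 2400 RPM device works out to 12.5 ms and 25 ms.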

"A word about Rotational Positional Sensing (RPS).  RPS is a feature
available on some disk subsystems, and a required option (?) on MSU451s,
which keeps the MPC informed about the rotational position of each disk.  In
a situation where the physical channel is free and more than one device has
completed its seek, the MPC can assign the physical channel to the unit
whose  requested  sector  is  closest  to  the  heads.  The  idea  is  to  reduce 
rotational  latency  thereby  increasing  the  physical  channel  time  which  can  be 
utilized  for  data  transfer.  What  does  RPS  buy  you  if  only  one  disk  unit  has 
completed  its  seek  when  the  physical  channel  is  assigned?  Not  a  thing. 

"Now  that  device  0-6-6  has  been  assigned  the  physical  channel  and  the 
requested  sector  is  under  the  heads,  the  physical  transfer  of  data  begins. 
The  time  to  complete  the  transfer  of  data  depends  on  the  number  of  words  to 
be  transferred,  the  rotational  velocity  of  the  device,  and  the  recording 
density of the media. Data transfer rates per second are given in Table 3.

Device    Transfer Rate

DSU181     416KC     69.33KW
DSU191    1074KC    179KW
MSU451    1074KC    179KW

KC = 1000 6-BIT CHARACTERS
KW = 1000 36-BIT WORDS

Data Transfer Rates
Table 3
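
The two rate columns are related by the number of 6-bit characters per word, and the time on the channel for a transfer follows directly from the word rate. The sketch below is illustrative only: it assumes six characters per word (which is what the 416KC and 69.33KW figures imply) and it ignores seek time and rotational latency.

```python
CHARS_PER_WORD = 6  # assumption: six 6-bit characters per machine word


def words_per_second(kc_rate):
    """Convert a rate in KC (thousands of characters/sec) to words/sec."""
    return kc_rate * 1000 / CHARS_PER_WORD


def transfer_time_ms(words, kc_rate):
    """Pure data transfer time in milliseconds for a block of `words`."""
    return words / words_per_second(kc_rate) * 1000


if __name__ == "__main__":
    for device, kc in (("DSU181", 416), ("DSU191", 1074), ("MSU451", 1074)):
        print(f"{device}: {words_per_second(kc) / 1000:.2f} KW/sec, "
              f"320-word transfer {transfer_time_ms(320, kc):.2f} ms")
```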

"When  data  transfer  has  completed,  an  I/O  terminate  interrupt  is  issued  to 
the  H6000  CPU  (CPU  0).  The  GCOS  dispatcher  suspends  execution  of  the 
program  in  operation,  performs  a  few  housekeeping  duties  and  then  dispatches 
CPU  0  to  the  disk  channel  module  in  IOS.  A  bit  more  housekeeping  and,  for 
our  purposes,  the  I/O  to  0-8-6  is  complete.  IOS  then  checks  the  I/O  queue 
for  seeks  to  start  —  note  that  at  this  point  it  has  a  free  logical  channel, 
and  so  can  start  at  least  one  seek  if  an  I/O  request  is  pending  against  a 
free  device  on  this  subsystem.  If  a  DSS181  subsystem  is  present,  it  must 
also  check  for  a  seek  complete  status  as  described  above.  When  IOS  can 
perform no more actions, it relinquishes control to the dispatcher.

"In terms of capacity, a DSU191 can hold as much data as approximately 4.3
DSU181 packs (see Table 4). A MSU451 can store roughly 2 DSU191 packs or
8.6 DSU181 packs worth of data. In terms of data transfer rates, the 191s
and 451s are about 2.6 times faster than 181s.

Device    Char. per pack

DSU181     27.6mc
DSU191    117.9mc
MSU451    235.8mc

Device Capacities
Table 4


Here are a few WATCHITs.

"WATCHIT 1: When replacing 181s with 451s. The vast capacity of the 451s
makes it feasible to replace up to 8 181s with a single 451. In a situation
where 4 I/Os were pending to files residing on 4 different 181s (our 2 to 1
cross-barred dual channel example), as opposed to the 451, the 181
configuration might be preferable. Although the 451 is 2.6 times as fast,
we would sacrifice overlapping of the seeks, and the 181 subsystem can
transfer data from 2 of the devices in parallel. The CM Device I/O Queue
Length Histogram and the Channel Free-Device Busy Report can indicate if a
device contention problem exists.

"WATCHIT 2: When pondering vendor specs for 181s. We modified the MSM data
reduction program to obtain the average data transfer size to each disk
device. MSM also provides the average cylinders seeked per I/O for each
device. Assuming the average rotational latency, one should be able to use
the vendor specs provided in this article to compute the I/O service time
for a given device. The I/O service time for each device is provided as a
CM report. For MSU451s the computed value matches the CM report to within a
few milliseconds. However, the 181 computed and monitored values differ by
as much as 30 ms and more, with the computed value always being less. Since
channel utilization is fairly low, the most likely reason for this
discrepancy is the IOS latency time we discussed earlier.

"WATCHIT 3: When placing TSS swap files (#S, #T, #U, #V). We mentioned
earlier that TSS requests for access to its swap files were linked to the
front of the I/O queue. TSS is not one to wait on user I/Os, nor does it
break its swap action into nice little 320-word standard system format
I/Os. It's gonna swap the whole thing. Say, there's a nice little TSS
COBOL subsystem, only 40K! Time to swap? So soon? Oh yeah, that memory-time
quantum. Lot of memory, not much time. Oh well, a 40K transfer is OK. My
swap files are on fast 451s. Friend, you just tied up that device for
almost a quarter of a second, and more importantly, a physical channel as
well. That's only half of it. TSS swapped it out, you can bet it'll swap
it back in. So we're up to .5 sec of physical channel time to free up some
TSS user memory. This scenario impacts total system throughput, not just
TSS. That's not all the havoc that larger user subsystems create, but that
is enough heartburn for now."
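
The channel time quoted in WATCHIT 3 can be checked with a short calculation. This is a rough sketch, not part of the original article; it assumes that "40K" means 40 x 1024 words, uses the 179KW per second MSU451 rate from Table 3, and ignores seek and rotational latency.

```python
MSU451_WORDS_PER_SEC = 179_000  # 179KW per second, from Table 3


def swap_channel_seconds(size_k_words, words_per_sec=MSU451_WORDS_PER_SEC,
                         round_trip=True):
    """Physical channel time consumed by swapping a subsystem.

    size_k_words: subsystem size in K words (assumed K = 1024).
    round_trip:   count both the swap-out and the later swap-in.
    """
    words = size_k_words * 1024
    one_way = words / words_per_sec
    return one_way * 2 if round_trip else one_way


if __name__ == "__main__":
    print(f"swap out only : {swap_channel_seconds(40, round_trip=False):.3f} sec")
    print(f"out and back  : {swap_channel_seconds(40):.3f} sec")
```

Under these assumptions the one-way figure comes to a little under a quarter of a second, consistent with the article's estimate.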

14.6.6  Communication  Evaluation.  The  communication  evaluation  will 
determine  the  overall  terminal  usage  of  a  system.  It  can  also  be  used  to 
examine  the  DN355  usage.  This  evaluation  can  be  done  using  either  the  CAM 
or  the  CRTS  monitor,  or  both.  Figures  14-11  through  14-14  are  sample  table 
formats  that  may  be  used  to  display  the  gathered  data. 

14.6.6.1  Data  Recording.  For  figure  14-11,  the  Terminal  Session  Report  is 
used.  All  users  with  TSS  subsystems  of  35K  are  recorded.  Column  7  is 
obtained  from  scanning  the  Excess  Think/Response  Time  Report.  Figure  14-12 
is  obtained  from  the  Response  Time  Report.  All  periods  of  time  when  the 
response  for  TSS  is  greater  than  15  seconds  are  recorded.  Figure  14-13  is 
obtained  directly  from  the  High  Terminal  Usage  Report.  All  DAC  terminals 
with  over  90  percent  usage,  except  WIN  lines,  are  recorded. 

Figure 14-12. Poor TSS Response Log

Figure 14-14 is obtained from the H6000-DN355 Reject Report and the Abort
Report. The H6000-DN355 Reject Report is used if the average number of
reject commands per hour of running time is greater than 50 (total number of
reject commands/number of hours in run). Terminals with more than 50
percent of the rejects should also be listed. For the Network Control
Program (NCP) disconnects, all NCP 01 line disconnects listed in the Abort
Report should be tallied.
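
The recording rule for the reject data can be expressed as a small filter. The sketch below is illustrative only: the input format (a mapping of terminal identifier to reject count) and the function name are assumptions, not the layout of the actual H6000-DN355 Reject Report.

```python
def reject_summary(rejects_by_terminal, run_hours,
                   rate_threshold=50.0, share_threshold=0.5):
    """Apply the recording criteria for the reject report.

    rejects_by_terminal: hypothetical dict of terminal id -> reject count.
    run_hours:           length of the monitoring run in hours.
    Returns the hourly reject rate, whether it exceeds the threshold,
    and the terminals responsible for more than half of all rejects.
    """
    total = sum(rejects_by_terminal.values())
    rate = total / run_hours
    heavy = sorted(term for term, count in rejects_by_terminal.items()
                   if total and count / total > share_threshold)
    return {"rejects_per_hour": rate,
            "record": rate > rate_threshold,
            "heavy_terminals": heavy}


if __name__ == "__main__":
    print(reject_summary({"T01": 400, "T02": 100, "T03": 100}, run_hours=8))
```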

14.6.6.2  Evaluating  the  Data.  The  following  procedure  should  be  followed 
in  order  to  analyze  the  data. 

Step  1  -  TSS  response  time  is  dependent  upon  certain  TSS  parameters. 

The  TS1  Initial  Parameter  Report  contains  current  settings  of  these 
parameters.  The  critical  parameters  are: 

a.  Initial  TS1  Max  Size  -  If  operators  must  increase  max  size  during 
the  day,  TSS  can  slow  its  processing  of  user  responses  until  it  can 
grow.  This  should  be  set  to  the  normal  maximum  size  TSS  reaches. 

b.  Size  Growth/Reduction  Factor  -  These  sizes  should  be  identical  or 
growth  can  be  twice  reduction. 

c.  Max  Number  of  Terminals  -  Should  be  large  enough  to  satisfy  all 
possible  users. 

d. Large Subsystem Size/Wait Time - The average subsystem size should
be  less  than  35K  and  all  large  users  (greater  than  35K)  should  be 
penalized. 

e.  Number  Swap  Files  -  On  an  active  TSS  system,  should  be  four. 

f.  Allocated  Devices  -  Swap  Files  (#S,  #T,  #U,  #V)  should  be  on 
specific  devices,  as  determined  by  MSM. 

Step  2  -  Users  of  large  TSS  subsystems  cause  TSS  to  swap  other  users  out 
and  therefore  generate  extra  TSS  and  system  I/O  overhead.  If  an  excessive 
number  of  users  are  using  subsystems  of  greater  than  35K  (figure  14-11), 
these  users  should  be  queried  as  to  what  subsystem  they  are  using.  Any 
large  subsystem  that  has  high  usage  should  be  investigated  for  possible 
rewrite. 

Step  3  -  The  overall  system  and  TSS  response  time  should  be  monitored 
using  figure  14-12.  Periods  of  bad  response  should  be  checked  to  ensure  the 
bad  responses  are  not  caused  by  a  few  (less  than  10  percent  of  all  in-range 
responses)  out  of  range  responses.  If  TSS  response  is  truly  poor,  a 
correlation  between  response  and  large  subsystem  usage  should  be  attempted 
(figure  14-11  and  figure  14-12). 

Step  4  -  Terminals  logged  onto  the  system  for  long  periods  of  time 
(greater  than  75-80  percent  of  the  monitoring  session),  but  having  few 
inputs  and/or  few  inputs/outputs,  should  be  investigated  (figure  14-13). 


Terminals  with  few  inputs,  but  many  outputs,  are  probably  logged  onto 
VIDEO.  The  number  of  users  on  VIDEO  should  be  restricted  to  one  or  two  per 
DATANET, due to the buffer load placed on the DATANET by VIDEO. If more
users are required, monitor output terminals should be used instead of more
VIDEO users. Terminals with few inputs and few outputs may be in use
just  to  keep  a  terminal  logged  on.  This  practice  causes  unnecessary  TSS 
size  and  processing  overhead. 

Step  5  -  Terminals  with  a  large  number  of  Opcode  Rejects  (figure  14-14) 
are  an  indication  of  possible  line  problems.  The  terminals  and  lines  should 
be  checked  for  noise  and  transmission  errors.  Numerous  NCP  disconnects  per 
day  (figure  14-14)  is  usually  an  indication  of  line  or  IMP  problems.  NCP 
should  have  fewer  than  five  disconnects  per  day. 

14.6.7  Timesharing  Evaluation.  The  TSS  evaluation  will  attempt  to 
determine  causes  for  poor  TSS  response  time  periods.  This  evaluation  can  be 
done using the TSS Monitor of GMF.

14.6.7.1 Background. The GMF/TSSM data collection tool is one of several
GMF monitors designed to collect performance information from major portions
of  the  GCOS  operating  system  and  related  software  components.  The  TSSM  is 
designed  to  monitor  the  performance  of  the  TSS  executive  software.  One 
hundred  and  four  data  collection  points  have  been  established  within  the  TSS 
executive  code,  tracing  the  execution  of  nearly  every  major  routine.  These 
tracing  points  are  sufficient  to  allow  for  the  isolation  of  time  sharing 
response  time  problems  to,  at  most,  a  few  routines. 

The  trace  points  divide  the  processing  performed  by  the  TSS  executive  into 
several  states.  These  states  or  processing  subdivisions  allow  for  the 
nature  of  a  delay  to  be  identified.  The  states  described  are: 

a. Eligible for Processor Dispatch. During this state, the user
(subsystem)  is  not  being  processed  by  the  TSS  executive,  but  is 
under  the  control  of  the  system  dispatching  module,  .MDISP. 

b. User File I/O. During this state, the user is waiting on the
completion  of  a  file  I/O  operation  which  the  user  (or  his  subsystem) 
is  responsible  for  initiating. 

c.  Command  Processing.  The  user  cannot  affect  the  time  spent  during 
the  scanning,  analysis,  and  subsystem  loading  caused  by  his  entry  of 
time  sharing  commands,  but  the  overhead  incurred  due  to  these 
actions  is  directly  attributable  to  him. 

d.  Derail  Processing.  In  order  to  receive  service  from  the  TSS 
executive  (and  from  GCOS)  a  user  subsystem  uses  DRL  instructions. 
When,  during  execution,  a  DRL  instruction  is  encountered,  a  fault 
occurs, and the user is removed from the subdispatch queue and
placed in a fault queue. When this fault queue entry is serviced,
the user may enter the "Derail Processing" state to service some
subsystem  request.  As  this  is  the  major  cause  of  fault  queue 
entries,  this  state  will  track  the  time  required  to  service  the 
derail  request,  and  to  return  the  user  back  to  the  first  state 
defined  above. 

e.  Processor  Allocation.  During  the  processing  associated  with  this 
state,  TSS  examines  the  outstanding  requests  for  fault  queue 
service,  memory  allocation,  or  line  service.  This  state  is  ended 
when  the  allocation  routine  locates  the  correct  servicing  routine 
and  transfers  to  it. 

f.  UST  Management.  In  order  to  keep  track  of  each  of  the  TSS  users, 
the  executive  maintains  a  User  Status  Table  for  each  terminal  which 
is  active.  The  processing  which  occurs  within  this  state  involves 
the  management  of  this  list  of  UST  entries.  Entries  may  be  added, 
moved,  compressed,  released,  and  updated. 

g.  Memory  Management.  TSS  maintains  space  for  both  user  subsystems  and 
User  Status  Tables.  Allocation  of  this  space  is  made  based  on  a 
complicated  algorithm  which  uses  the  amount  of  space  required,  the 
amount  used,  and  the  priority  of  various  users.  TSS  will  adjust  its 
overall  size  based  on  the  demands  placed  upon  it,  and  within  the 
bounds  established  by  both  boot  deck  patches  and  operator  commands. 
The  periods  spent  by  TSS  within  this  logic  will  be  accounted  as 
falling  within  this  state. 

h.  Swap  Space  Management.  The  TSS  executive  maintains  its  own  swap 
files  for  user  subsystems.  The  decision  to  swap  or  unswap  a  user's 
subsystem  is  also  a  complicated  process,  and  involves  not  only  the 
demand  being  experienced,  but  also  the  time  spent  by  a  user  without 
doing  I/O.  The  periods  spent  by  the  TSS  executive  during  the 
management  of  its  swap  space  will  be  accounted  as  falling  within 
this  state. 

i.  PAT  Table  Management.  The  TSS  system  also  controls  the  PAT 
(Peripheral  Allocation  Table)  space  used  to  define  the  accesses  made 
by users to temporary and permanent files. While the time spent
during  this  processing  is  directly  attributable  to  some  user  action, 
the  importance  of  this  processing  indicates  a  need  to  discern 
between  the  time  spent  by  TSS  for  other  functions  and  that  spent 
during  the  management  of  these  PAT  Tables. 

j.  TSS  Idle.  The  time  sharing  executive  may  be  idle  due  to  an 
unsatisfied MME request, or due to its having no outstanding work.
When either of these conditions arises, the program will relinquish
control,  and  will  not  receive  processor  attention  until  some 
condition changes.

k. Line Management. TSS employs MMEs to communicate with the GCOS
modules DNET and ROUT, which are responsible for communication with
the  remote  terminals  used  by  TSS  users.  Traces  captured  while  in 
this  executive  state  will  define  exactly  the  response  time 
experienced  at  each  TSS  user  terminal. 

If  the  boundaries  between  these  states  can  be  traced,  and  the  time  spent 
during  a  particular  interval  when  poor  response  is  indicated  is  compared 
from  one  state  to  the  next  (as  well  as  across  common  states  during  different 
response  intervals)  the  source  of  a  poor  response  can  be  isolated,  and  the 
probable  cause  inferred.  This  is  the  most  that  can  be  gained  by  a  tool  of 
this  nature.  It  should  not  be  expected  that  the  GMF/TSSM  will  be  able  to 
pinpoint  the  exact  cause  of  all  possible  response  time  problems. 
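
The comparison of time spent per state across response intervals can be sketched as follows. This is a hypothetical illustration only: the record layout (state name, start time, end time) is an assumption and does not reflect the actual GMF/TSSM trace format or the reports produced from it.

```python
from collections import defaultdict


def time_by_state(records):
    """Sum elapsed time per executive state.

    records: iterable of (state_name, start_sec, end_sec) tuples,
             a hypothetical reduction of the raw trace data.
    """
    totals = defaultdict(float)
    for state, start, end in records:
        totals[state] += end - start
    return dict(totals)


def largest_increase(baseline, poor):
    """Find the state whose time grew most between a normal interval
    and an interval of poor response."""
    base, bad = time_by_state(baseline), time_by_state(poor)
    deltas = {state: bad.get(state, 0.0) - base.get(state, 0.0)
              for state in set(base) | set(bad)}
    return max(deltas, key=deltas.get), deltas


if __name__ == "__main__":
    normal = [("User File I/O", 0.0, 1.0), ("Swap Space Management", 1.0, 1.5)]
    slow = [("User File I/O", 0.0, 1.2), ("Swap Space Management", 1.2, 4.0)]
    state, deltas = largest_increase(normal, slow)
    print(state, round(deltas[state], 2))
```

A result like this only isolates the state in which the extra time was spent; the probable cause still has to be inferred by the analyst.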

The data collected by the existing GMF/TSSM can be useful towards the ends
for  which  it  was  developed.  The  data  collected  by  the  one  hundred  and  four 
existing  trace  points  can  be  used  to  identify  periods  of  poor  response  time, 
and  by  separating  the  processing  performed  by  the  executive  into  the  states 
described  above,  the  source  of  these  periods  can  be  isolated  and  categorized. 

14.6.7.2  Report  Organization.  The  time  sharing  executive  is  a  complex 
system.  TSS  contains  numerous  tunable  parameters  which  can  be  adjusted  by 
site  option  patches.  When  a  response  problem  is  encountered,  the  most 
likely  solution  will  involve  some  adjustment  of  these  tunable  parameters  to 
better  match  the  workload  being  experienced  at  the  site.  Section  14.6.7.3 
of  this  report  presents  an  overview  of  the  performance  considerations 
relevant  to  terminal  response  time,  and  describes  each  of  the  tunable 
parameters  associated  with  the  time  sharing  executive. 

Before  response  time  problems  can  be  analyzed,  they  must  first  be 
identified.  While  it  is  probable  that  no  analysis  would  be  performed  if 
poor  response  were  not  already  being  noticed  by  users,  response  time 
anomalies  do  exist  which  will  not  be  noticed  by  even  the  most  observant 
user.  These  "invisible"  delays  can  sum  to  a  considerable  performance 
problem.  Subsection  14.6.7.4  describes  how  the  user  can  analyze  the  reports 
produced  by  the  TEARS  system  to  identify  both  obvious  and  subtle  response 
time  and  performance  problems. 

Once  a  poor  response  time  problem  has  been  identified,  it  must  be  analyzed 
to  determine  its  cause.  The  isolation  and  analysis  of  response  time 
problems  based  on  the  reports  produced  by  the  TEARS  system  is  discussed  in 
subsection  14.6.7.5. 

14.6.7.3 TSS Tunable Parameters. Within the TSS executive, many of the
algorithms  used  to  control  user  processes  can  be  adjusted  to  better  process 
the  particular  workload  experienced  at  a  site. 


The tunable parameters associated with the Time Sharing System have been
categorized  into  several  areas,  each  of  which  is  described  in  the  sections 
below.  The  reader  is  warned  that  the  changes  to  the  default  settings  for 
these  parameters  should  be  made  with  care. 

14.6.7.3.1 Priority 'B' Processing Parameters. The most important of the
tunable parameters associated with TSS response time are those associated
with the Priority 'B' processing option located in the system dispatching
module,  .MDISP.  These  options  will  allow  the  TSS  executive  to  be  considered 
for  dispatch  with  a  much  higher  frequency  than  the  consideration  given  to 
normal  (batch)  slaves.  CCTC/C751  does  not  recommend  the  activation  of  these 
processing  options  because  of  the  overhead  faced  by  other  (batch)  jobs  when 
they  are  active.  Activation  of  Priority  'B'  will,  for  example,  limit  the 
processor  attention  given  to  WIN  system  jobs. 

There  are  several  parameters  which  are  associated  with  Priority  'B' 
processing,  and  the  settings  for  each  should  be  carefully  considered.  Many 
WWMCCS sites have mistaken these parameters for switches which are "turned
on" to enable Priority 'B'.

The  site  option  patch  shown  below  will  enable  Priority  'B'  processing  for 
the  system. 


CC     CC      CC                                                       CC
1      8       16                                                       73

000000 OCTAL   000001000000  BIT18=1 ENABLE PRIORITY B (#A1E)           .MDISP

Once  enabled,  the  dispatcher  will  search  a  list  of  priority  jobs  prior  to 
initiating  each  dispatch.  To  add  TSS  to  this  list  of  priority  jobs,  the 
following  site  option  patch  must  be  used. 

CC     CC      CC                                                       CC
1      8       16                                                       73

000001 OCTAL   6245500000XY  PLACE TSS IN PRIORITY B TABLE (#A1X)       .MDISP

'X' is equal to the number of non-class 'B' priority dispatches for each
dispatch used by TSS. For example, if this variable is set to one, then TSS
could receive every other dispatch. 'Y' is equal to the number of 32
millisecond time units which should be granted on each dispatch to TSS
before timer runout faults occur.
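
As an illustration of how the X and Y values fit into the patch word, the sketch below fills in the final two positions of the 6245500000XY template. It assumes that X and Y each occupy a single octal digit (0 through 7), which is how the template appears; the function names are hypothetical.

```python
def priority_b_table_word(x, y):
    """Build the 12-digit octal patch value 6245500000XY.

    x: non-Priority 'B' dispatches allowed per TSS dispatch.
    y: number of 32 ms time units granted to TSS per dispatch.
    Assumes each value must fit in one octal digit.
    """
    for name, value in (("X", x), ("Y", y)):
        if not 0 <= value <= 7:
            raise ValueError(f"{name} must be a single octal digit (0-7)")
    return f"6245500000{x:o}{y:o}"


def tss_time_slice_ms(y):
    """Time granted to TSS on each dispatch, in milliseconds."""
    return y * 32


if __name__ == "__main__":
    print(priority_b_table_word(1, 4), tss_time_slice_ms(4), "ms")
```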


Finally,  the  following  site  option  patch  is  used  in  conjunction  with  #A1X 
shown  above.  This  patch  specifies  the  processor  requests  which  will  be  made 
by  TSS. 


CC     CC      CC                                                       CC
1      8       16                                                       73

000174 OCTAL   MMMMMMNNNNNN  .TAGPP PRIORITY B WORD (#A0Y)              .MTIMS

MMMMMM is equal to the number of 32 millisecond time units required for each
dispatch to the TSS executive. NNNNNN is used to specify the relationship
between TSS and other Priority 'B' jobs. If NNNNNN=1, then TSS will be
considered on every other Priority 'B' dispatch. If NNNNNN=3, then TSS will
be considered on every fourth Priority 'B' dispatch.

The settings for MMMMMM and NNNNNN should be made in conjunction with
particular site requirements. It is recommended that the number of 32 ms
time units associated with a dispatch be large enough to ensure that TSS is
not forced to request a second dispatch to satisfy all outstanding
requests. The default value for MMMMMM is 4. The default value for NNNNNN
is 1. Thus, without this patch, TSS will request a 4*32 ms time slice, and
will be considered on every second Priority 'B' dispatch.
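
The layout of the Priority 'B' word can be sketched as two half-word fields. The example below is illustrative and assumes that MMMMMM and NNNNNN are each six octal digits (one 18-bit half of the word), as the template suggests; the function names are hypothetical.

```python
HALF_WORD_LIMIT = 1 << 18  # six octal digits per field


def priority_b_word(time_units=4, n=1):
    """Build the MMMMMMNNNNNN patch value as a 12-digit octal string.

    time_units: number of 32 ms units requested per dispatch (MMMMMM).
    n:          relationship to other Priority 'B' jobs (NNNNNN).
    Defaults are the values the text gives for an unpatched system.
    """
    if not (0 <= time_units < HALF_WORD_LIMIT and 0 <= n < HALF_WORD_LIMIT):
        raise ValueError("each field must fit in six octal digits")
    return f"{time_units:06o}{n:06o}"


def describe(time_units, n):
    """Return (time slice in ms, TSS considered once every k dispatches)."""
    return time_units * 32, n + 1


if __name__ == "__main__":
    print(priority_b_word(), describe(4, 1))
    print(priority_b_word(8, 3), describe(8, 3))
```

With the defaults this gives a 128 ms time slice considered on every second Priority 'B' dispatch, matching the description above.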

14.6.7.3.2  Placement  of  TSS  Files.  The  TSS  executive  maintains  up  to  seven 
files  to  hold  pending  deferred  sessions,  user  subsystems,  and  to 
swap/push-down  active  user  memory.  These  files  can  be  allocated  each  time 
the  time  sharing  system  is  started,  or  they  may  be  declared  as  permanent. 

At  a  minimum,  it  is  recommended  that  the  swap  files  maintained  by  TSS  be 
declared  as  permanent.  By  doing  this,  the  accesses  made  to  these  files  by 
the  TSS  executive  can  be  optimized  by  avoiding  the  problems  associated  with 
disk  fragmentation. 

Most sites have already made their deferred queue file (DQ1) permanent, to
allow  scheduled  deferred  sessions  to  be  saved  across  system  boots. 

In  order  to  assign  the  TSS  deferred  queue  and  swap  files  as  permanent,  the 
following  cards  should  be  placed  in  the  $EDIT  section  of  the  startup  deck. 


FILDEF   dvc,DQ1,48
FILDEF   dvc,SW1,nnn
FILDEF   dvc,SW2,nnn
FILDEF   dvc,SW3,nnn
FILDEF   dvc,SW4,nnn

The  device  assignments  for  these  files  (dvc)  should  be  made  based  on  the 
following criteria:

o Different swap files should not occur on the same device.

o  Where  possible,  the  swap  files  should  be  adjacent  to  heavily  used 
GCOS  files,  thus  minimizing  I/O  times. 

Placement of the DQ1 (deferred queue) file is not as critical as the swap
files,  as  it  is  not  as  heavily  accessed.  The  sizes  of  the  TSS  swap  files 
(nnn)  should  be  made  based  on  configuration  limits,  but  should  not  be  lower 
than  2260  LLINKS. 

In  addition  to  the  startup  cards  shown  above,  the  following  site  option 
patches  must  be  included.  This  patch  changes  the  default  USERID  for  TSS 
files from OPNSUTIL to GCOS3, thus allowing the permanent deferred queue and
swap  files  to  be  physically  near  other  GCOS  system  files. 

CC  CC  CC  CC 

000132 OCTAL 272346620320 DEFAULT UID FOR TSS FILES GCOS3 .MTIMS

000362  OCTAL  202020202020  .MTIMS 

The  reader  should  note  that  the  use  of  this  patch,  which  changes  the  default 
location  for  TSS  files,  will  also  change  the  permissions  requirements  for 
the  files  associated  with  the  GMF  collector.  USERID  B29IDPX0  must  be 
given modify permissions against GCOS3/TS1, if B29IDPX0 is to be used to
install the patches which enable the TSSM. Also, the catalog/file string
containing the H* file for the collector program should be modified to give
GCOS3 read permissions.

Finally,  the  device  placement  for  the  TSS  program  files  should  be  made  to 
avoid  contention  with  FMS.  Program  files  should  be  forced  to  occur  on  some 
device  which  does  not  contain  an  SMC  or  SMC  duplicate.  The  following  site 
option  patch  declares  the  device  to  use  for  each  of  the  TSS  program  files, 
and  associates  the  name  used  on  the  $  FILDEF  cards  described  above  with  the 
file  codes  used  by  TSS. 


CC     CC      CC                                                       CC
1      8       16                                                       73

000365 OCTAL   400000626601  SWAP FILE SW1 PERMANENT (#A1D)             .MTIMS
000366 OCTAL   400000626602  SWAP FILE SW2 PERMANENT (#A1D)             .MTIMS
000367 OCTAL   400000626603  SWAP FILE SW3 PERMANENT (#A1D)             .MTIMS
000370 OCTAL   400000626604  SWAP FILE SW4 PERMANENT (#A1D)             .MTIMS
000361 OCTAL   626302        TSS PRIM PGM FIL IN ST2 (#A1D)             .MTIMS
000362 OCTAL   626303        TSS SCND PGM FIL ON ST3 (#A1D)             .MTIMS

During TSS initialization, allocation (of the TSS program files) will be
attempted  on  the  device  specified  in  the  site  option  patch  described  above. 

If  allocation  is  impossible  on  the  specified  device,  a  message  will  be 
printed  on  the  system  console  and  allocation  will  be  attempted  on  another 
device.  When  this  message  begins  to  appear,  site  personnel  should  either 
schedule  a  cold  boot  or  attempt  to  depopulate  the  device  indicated  to  allow 
allocation  of  the  TSS  file. 

14.6.7.3.3  TSS  Swap  File  Processing.  If  the  reader  has  chosen  to  assign 
TSS  swap  files  as  permanent,  he  may  disregard  the  discussions  in  this 
section  which  relate  to  the  growth  and  release  of  swap  file  space.  The  size 
(as  declared  on  the  $  FILDEF  card)  for  permanent  swap  files  will  not  change. 

The  TSS  system  will  maintain  up  to  four  separate  swap  files  to  hold  user 
subsystems  which  have  been  either  swapped  or  pushed-down.  When  a  user 
subsystem  is  moved  to  these  swap  files,  TSS  will  place  it  on  the  least  busy 
of  the  active  swap  files.  The  size  of  these  swap  files  can  be  adjusted  if 
enough  room  is  not  available  on  any  one  of  them.  The  process  which  will 
grow  and  shrink  these  swap  files  can  be  adjusted  through  the  use  of  the  site 
option  patches  described  below. 

Adjustment  of  these  parameters  may  prove  necessary  if  the  data  collected  by 
the GMF/TSSM reveals that excessive MME GEMOREs are being executed during
swap  file  processing.  In  addition,  data  collected  by  the  GMF/TSSM  can  be 
used  to  identify  the  optimum  number  of  swap  files  which  are  required,  based 
on  the  volume  of  swap  processing  indicated  by  the  traces  collected. 

GMF/TSSM  trace  type  2  is  generated  if  an  attempt  to  grow  a  TSS  swap  file  is 
denied.  The  continued  occurrence  of  this  type  of  trace  should  prompt  site 
personnel  to  either  increase  the  number  of  active  swap  files,  depopulate  the 
devices  used  to  hold  the  swap  files,  or  perform  a  cold  boot  to  gather 
fragmented  disk  space. 

CC     CC      CC                                                       CC
1      8       16                                                       73

000357 OCTAL   000000000004  FOUR SWAP FILES ACTIVE (#A13)              .MTIMS
000356 OCTAL   000454000000  MINIMUM SWAP FILE GROWTH LINKS (#A14)      .MTIMS
000355 OCTAL   000000002260  INITIAL SWAP FILE SIZE LLINKS(EACH) (#A15) .MTIMS

Computation  of  the  optimal  number  and  size  of  the  TSS  swap  files  is  not 
exact,  and  involves  consideration  of  the  following  factors  from  the  TEARS 
reports: 

o  Number  of  swaps  being  performed 

o Number of force swaps (discussed later in this section)

o Average subsystem sizes

o Duration of swaps

o Configuration limitations on disk and core space

o Number of concurrent TSS users

The  minimum  interval  between  considerations  to  swap  user  subroutines  can  be 
specified  through  the  modification  of  the  following  (non-site  option)  cell 
of  TSS.  The  value  specified  will  be  used  to  prevent  swap  file  processing 
from  utilizing  too  large  a  share  of  the  total  processing  time  available  to 
the  executive.  Force  swaps  may  occur  without  reference  to  this  cell,  but 
are  controlled  by  other  methods  (as  described  in  subsection  14.6.7.3.7). 


CC     CC      CC                                                       CC
1      8       16                                                       73

000172 OCTAL   000000175000  36/1SEC*64000 CLK PLSES SWAP INVL (A.SD3I) .MTIMS

The smallest memory wait time before swap logic is considered can be set to
prevent swap logic from being invoked for memory allocation purposes before
normal subsystem terminations might allow allocation to be completed.

CC     CC      CC                                                       CC
1      8       16                                                       73

00000166 OCTAL 372000        36/2SEC*64000 CLK PLSES (TSAWT)            .MTIMS
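
The octal values in these timer cells follow from the clock rate given in the patch comments (64000 clock pulses per second). A minimal sketch of the conversion, with an illustrative function name:

```python
CLOCK_PULSES_PER_SEC = 64_000  # from the "SEC*64000 CLK PLSES" patch comments


def timer_cell_octal(seconds):
    """Convert an interval in seconds to a 12-digit octal clock-pulse count."""
    pulses = round(seconds * CLOCK_PULSES_PER_SEC)
    return f"{pulses:012o}"


if __name__ == "__main__":
    print(timer_cell_octal(1))  # one-second swap interval
    print(timer_cell_octal(2))  # two-second memory wait time
```

One second yields 175000 octal and two seconds yields 372000 octal, the values shown in the two patches above.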

The  following  cell  is  included  in  the  TSS  executive,  but  is  not  recommended 
for  use,  as  its  ramifications  cannot  be  predicted.  If  non-zero,  the 
contents  of  this  cell  will  be  used  to  override  any  other  swap  processing 
timers,  forcing  swap  processing  to  occur  with  the  frequency  specified  in  the 
timer.  If  specified,  this  cell  would  seem  to  be  a  method  by  which  the 
number  of  user  subsystems  in  memory  could  be  minimized  by  actually  swapping 
each  subsystem  in  a  state  which  will  permit  it.  For  the  optimization  of  TSS 
response  time,  this  would  not  be  recommended.  Conversely,  if  this  cell  were 
specified  to  be  high,  swapping  would  not  occur  when  needed  for  the 
allocation  of  large  or  urgent  subsystems.  Again,  this  is  not  recommended 
for  performance. 

CC     CC      CC                                                       CC
1      8       16                                                       73

000212 OCTAL   000000000     36/MIN*60*SEC*64000 CLK PLSES BTWN SWP-A.SD3C .MTIMS


Finally,  the  following  site  option  patch  is  recommended  for  all  sites. 
Examination  of  the  operation  of  the  TSS  executive  by  officials  of  CCTC/C751 
has revealed that the usage of the various TSS swap files is not
balanced.  The  patch  shown  below  will  act  to  balance  this  usage,  and  will 
minimize  the  contention  caused  by  imbalanced  usage,  thus  lowering  the  wait 
times  for  those  TSS  users  and  subsystems  which  access  them. 


CC     CC      CC                                                       CC
1      8       16                                                       73

011365 OCTAL   012134221200       LDX1   .NXFIL  LAST FILE USED  #BJS   .MTIMS
011367 OCTAL   042315604200       TMI    P1      RESET INDEX     #BJS   .MTIMS
011373 OCTAL   042317603200       TRC    P2      ARE WE DONE     #BJS   .MTIMS
011376 OCTAL   042317710200       TRA    P2      ARE WE DONE     #BJS   .MTIMS
042315 OCTAL   000357721200  P1   LXL1   .TSSF   #SWAP FILS      #BJS   .MTIMS
042316 OCTAL   011366710200       TRA    DRM050  LOOP            #BJS   .MTIMS
042317 OCTAL   012134101200  P2   CMPX1  NXFIL   DONE YET        #BJS   .MTIMS
042320 OCTAL   011377600200       TRA    DRM055  YES             #BJS   .MTIMS
042321 OCTAL   011366710200       TRA    DRM050  NO              #BJS   .MTIMS

14.6.7.3.4 Subsystem Accounting. The GCOS Statistical Collection File (SCF)
software contained in the TSS executive can be adjusted to modify the
frequency with which TSS statistical records are written. Site personnel
should consider the overhead associated with this statistical record, and
review with security personnel the need to collect this information. If it
is not used, the site option patches described below can be used to disable
its collection and recording. When active, a dummy User Status Table (UST)
is maintained by TSS for SCF, thus decreasing the amount of UST (and
subsystem) memory available to TSS terminal users, and increasing overhead.


CC     CC      CC                                                       CC
1      8       16                                                       73
000120 OCTAL   000000000060  COLLECT STATS EVERY 60 SECS (#A11)         .MTIMS
000352 OCTAL   000000000001  TURN ON SUBSYSTEM ACCNTNG (#A1B)           .MTIMS


On  a  lightly  or  moderately  loaded  TSS  system,  the  presence  or  absence  of 
subsystem  accounting  will  make  little  difference  to  performance.  On  a  more 
heavily  loaded  system,  however,  this  site  option  should  be  disabled  by 
removing  one  or  both  of  these  site  option  patches,  unless  the  SCF  data 
collected  is  required  by  the  site. 

14.6.7.3.5  UST  Memory  Management.  Site  personnel  can  modify  several 
parameters  which  will  affect  the  amount  of  memory  which  can  be  used  by  the 
TSS  executive  for  User  Status  Tables  (USTs).  These  USTs  control  the  number 
of  active  users  which  can  be  processed  by  the  TSS  system  concurrently. 
Active users may be DRUN (Deferred command file) sessions or physical or
pseudo (at WIN sites) terminals. Memory for UST entries is set aside
directly above the executive code, and is separated from the memory used for
user subsystems by a movable fence.

If subsystem accounting is active, one UST will be used by the SCF
processes, and will not be available for use. It is included, however, in
the maximum UST settings described below.

The  maximum  amount  of  memory  for  USTs  is  computed  based  on  the  maximum 
number  of  terminals  allowed  by  the  executive.  This  parameter  is  a  settable 
option,  and  should  be  computed  by  site  personnel  based  on  the  following 
factors,  as  determined  by  the  reports  produced  by  the  TEARS  system: 

o  Maximum  core  allowances  for  TSS,  from  site  configuration  limitations 
as  determined  from  site  option  patches  (section  14.6.7.3.7) 
o  Average  user  subsystem  sizes  and  usage  rates  as  determined  from 
TEARS  reports  (section  14.6.7) 

o  DRUN  file  usage  rate  which  increases  the  number  of  USTs  required  as 
determined  from  the  usage  of  DRL  T.CFIO  (section  14.6.7) 
o  Occurrence  of  abnormal  disconnects,  timer  setting  (see  below) 

The  number  of  TSS  sessions  (including  DRUN  sessions)  which  will  be  allowed 
concurrently  is  set  through  the  following  site  option  patch. 

CC     CC      CC                                                       CC
1      8       16                                                       73
000117 OCTAL   000000000022  18 USTS MAXIMUM INCLUDING DRUN (.TFMAX)    .MTIMS

As  mentioned  above,  this  parameter  will  set  the  maximum  number  of  USTs  which 
can  be  active  within  TSS  at  any  one  time.  One  of  the  factors  which  will 
affect  this  setting  is  the  rate  with  which  users  abnormally  exit  (i.e.  other 
than  BYE)  from  TSS.  The  executive  will  continue  to  hold  the  UST  for  an 
abnormally  exited  user  for  a  period,  and  will  allow  a  user  to  reconnect  to 
this  UST  to  continue  a  previous  session.  The  time  period  for  holding  these 
'Reconnect' USTs is settable, and should be determined based on site needs,
maximum  core  limitations,  and  the  rate  with  which  these  abnormal  disconnects 
occur,  as  determined  from  the  reports  produced  by  the  TEARS  system. 


CC     CC      CC                                                       CC
1      8       16                                                       73
000310 OCTAL   000035230000  36/2MIN*60*64000 CLK PLSES (#A12)          .MTIMS
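
The timer cells in this section are expressed as counts of clock pulses, at
64000 pulses per second, held in a 36-bit word and printed as 12 octal
digits. As an illustration only, the conversion can be sketched in Python as
shown below. The function names are hypothetical and are not part of TSS or
GCOS; the pulse rate is taken from the "*64000" comments on the patch cards.

```python
PULSES_PER_SECOND = 64000  # rate implied by the "*64000" patch comments


def seconds_to_patch_word(seconds: int) -> str:
    """Return the 12-digit octal image of a timer value given in seconds."""
    pulses = seconds * PULSES_PER_SECOND
    if not 0 <= pulses < 2 ** 36:
        raise ValueError("value does not fit in a 36-bit word")
    return format(pulses, "012o")


def patch_word_to_seconds(word: str) -> float:
    """Interpret an octal patch value as clock pulses and return seconds."""
    return int(word, 8) / PULSES_PER_SECOND


# Two minutes, as in the reconnect timer patch above.
print(seconds_to_patch_word(2 * 60))          # 000035230000
# The five-second value used elsewhere in this section.
print(patch_word_to_seconds("000001161000"))  # 5.0
```

A site changing a timer would compute the new octal value in this way and
then verify the result against measurements before and after the change.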

The  other  consideration  which  will  influence  the  number  of  active  USTs 
involves  the  USTs  which  are  used  by  DRUN,  or  deferred  TSS  sessions.  The 
number  of  sessions  which  will  be  handled  by  TSS,  as  well  as  the  number  of 
active  terminals  which  will  prevent  DRUN  files  from  being  executed  can  be 
set. A preferred time for the scheduling of DRUN sessions can be
established. The determination of values for these parameters is dependent
upon site requirements. The parameters should be set based on:

o  Site  requirements  and  mission  objectives 

o  Consideration  of  the  active  terminal  loads  faced  by  TSS  weighted  by 
time  of  day 

o  Site  limitations 

The  maximum  number  of  DRUN  sessions  which  are  allowed  at  a  site  is  dependent 
upon  the  usage  of  this  feature.  It  is  set  through  the  use  of  the  following 
site  option  patch: 

CC     CC      CC                                                       CC
1      8       16                                                       73
000253 OCTAL   000010000004  8 DRUNS MAX, NONE IF 4 USRS ACTIVE (#A12)  .MTIMS
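
The patch value above appears to carry two limits in one word. The sketch
below, offered as an illustration only, splits a 36-bit octal value into its
two 18-bit halves. The reading of the upper half as the maximum number of
DRUN sessions and the lower half as the active-user cutoff is an assumption
inferred from the patch comment; the function name is hypothetical.

```python
def split_halves(word: str) -> tuple[int, int]:
    """Split a 36-bit octal word into (upper 18 bits, lower 18 bits)."""
    value = int(word, 8)
    return value >> 18, value & 0o777777


# Assumed layout: upper half = DRUN session limit, lower half = user cutoff.
max_druns, user_cutoff = split_halves("000010000004")
print(max_druns, user_cutoff)  # 8 4
```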

DRUN  sessions  may  be  scheduled  to  begin  at  a  specific  time  of  day,  or  may  be 
scheduled  with  no  start  time  specified.  If  no  time  is  scheduled,  the 
preferred  time  for  scheduling  DRUN  sessions  is  used.  This  parameter  is  set 
through  the  following  site  option  patch,  and  should  be  computed  based  on  the 
lightest  active  terminal  load  faced  by  TSS  during  the  computing  day,  as 
discovered  through  the  reports  produced  by  the  TEARS  system.  The  default, 
as  shown  below,  is  no  preference. 


CC     CC      CC                                                       CC
1      8       16                                                       73
000255 OCTAL   000000000000  36/HOUR*60*MIN*60*SEC*64000 (#A1A)         .MTIMS

Finally,  the  maximum  amount  of  processor  time  which  can  be  used  by  any  one 
DRUN  session  can  be  limited,  thus  increasing  the  throughput  of  these 
sessions,  and  limiting  their  use  of  the  UST  space  which  might  otherwise  be 
used  by  active  terminals. 


CC     CC      CC                                                       CC
1      8       16                                                       73
000254 OCTAL   000000000000  36/MIN*60*SEC*64000 CLK PLSES (#A19)       .MTIMS

GMF/TSSM  traces  record  the  manipulation  of  the  fence  between  UST  memory  and 
subsystem  memory,  and  give  an  indication  of  the  amount  of  space  actually 
being  used,  as  well  as  a  measure  of  how  dynamic  that  load  is.  The  settings 
discussed  in  this  section  should  be  made  such  that  they  do  not  prevent  TSS 
service  to  users,  but  should  be  in  line  with  the  actual  load  which  is 
expected. If 500 terminals are configured at a site, but only 100 are ever
seen  at  once,  site  personnel  should  base  their  limitations  on  the  load 
experienced,  not  that  which  is  possible  based  on  the  configuration. 

14.6.7.3.6 Subsystem Memory Management. Of the factors which can be used
to improve TSS response time, those described in this section are the most
effective. They control the rates with which TSS will perform the functions
which require the most time of the executive. Unfortunately, they are also
the most difficult to both understand and compute.

14.6.7.3.6.1  Subsystem  Memory  Allocation  Processes.  Above  the  fence  which 
separates  UST  memory  from  subsystem  memory,  and  below  the  base  address 
limitation  of  TSS,  core  memory  is  used  to  contain  active  user  subsystems. 

In  order  for  a  user  subsystem  to  receive  attention  from  the  processor,  it 
must  be  present  in  this  area.  TSS  maintains  its  own  swap  files,  and  will 
force  subsystems  to  these  files  based  on  the  amount  of  memory  available  and 
the  load  being  experienced. 

Based  on  settable  parameters,  the  TSS  executive  may  request  that  this  user 
subsystem  memory  be  grown  or  shrunk  by  settable  increments,  and  within 
maximum  and  minimum  TSS  size  limitations.  Allocation  routines  within  TSS 
contain  algorithms  which  determine  the  rates  with  which  this  subsystem 
memory  is  allocated,  and  are  programmed  to  discriminate  against  large 
subsystems to a settable degree. A maximum allowable 'wait time' is
computed based on the size of an allocation request. This rate changes when
the attempted allocation reaches a 'large subsystem fence' size, when it
will become much longer. Once a subsystem has waited for memory allocation
for a time greater than this allowable 'wait time', it will become urgent.
Once  an  urgent  subsystem  has  been  encountered,  the  allocation  processes  will 
limit  allocations  for  any  subsystems  which  are  not  urgent  by  reserving  an 
area  of  memory  for  urgent  subsystems  only. 

The  occurrence  of  those  GMF/TSSM  traces  which  record  the  'urgent  user' 
processes  should  prompt  site  personnel  to  consider  modification  to  one  or 
more  of  the  parameters  described  in  this  section.  The  presence  of  urgent 
users  (i.e.  user  memory  allocations)  is  indicative  of  an  overloaded  TSS 
system. 

After  an  urgent  user  has  waited  for  allocation  for  a  settable  time 
increment,  force  swaps  will  be  performed  against  subsystems  which  have  been 
present  in  core  for  a  separate  minimum  amount  of  time.  Subsystems  which 
become  'force  swapped'  are  considered  first  for  allocation  after  the  urgent 
user has been satisfied. If enough core is not present, or cannot be
obtained  through  force  swapping,  a  size  increase  may  be  performed  by  TSS 
within  its  maximum  size  limitations  (discussed  in  the  next  section). 

Ironically,  a  subsystem  which  has  been  urgent  longer  than  a  settable 
parameter  will  re-enter  the  allocation  process  from  the  beginning,  and 
allocation will return to normal. Urgent user processing lowers the ability
of TSS to service other allocation requests, as well as line service and
other executive functions. The purpose of this parameter is to prevent
urgent users from completely monopolizing TSS executive processor time on a
heavily loaded system.

Finally,  a  parameter  may  be  specified  which,  when  exceeded,  will  abort  the 
allocation  for  a  large  subsystem,  returning  the  message  'Not  Enough  Core'  to 
the  requesting  terminal.  The  parameter  may  be  used  to  prevent  or  limit  the 
number  of  times  that  the  urgent  user  coding  is  invoked  by  a  single 
allocation  request. 

14.6.7.3.6.2  Settable  Parameters.  The  following  parameters  are  associated 
with  the  subsystem  memory  management  routines  of  the  TSS  executive.  Few  of 
these  parameters  have  been  established  as  site  option  patches.  The  reader 
is  cautioned  against  gross  modification  of  these  default  values  which  is  not 
both  preceded  and  followed  by  careful  analysis  and  measurement  through  the 
GMF/TSSM and TEARS. The reader is further warned that the modification to
any one of these parameters may affect the optimum settings for still
others. The complexity of the interrelationships between these parameters
has not been approached in this report, and should not be underestimated.
Default values used in the examples shown in this section can be assumed to
be in balance, and will satisfy requirements at most sites.

The most common modification made by sites which are experiencing response
time problems, and the one which will yield the best results, is to apply
the following patches, which bias TSS against allocations for large user
subsystems. The patches below establish the size of a 'large' subsystem,
and the penalty (in allowable wait time) which is imposed against allocation
requests which are larger than this size.

000200 OCTAL   000044000000  .TAMIS  LG SS SIZ = 36K  (#A0X)
000207 OCTAL   000000000004  .TALPP  LG SS WAIT MLTPLR  (#A0X)

By  default,  a  large  subsystem  is  defined  as  being  at  least  36K  words  in 
size,  and  it  will  be  forced  to  wait  four  times  as  long  as  a  normal  subsystem 
before  becoming  urgent.  In  practice,  it  is  recommended  (in  general)  that 
the  definition  of  a  large  subsystem  be  made  based  on  the  actual  sizes  of  the 
subsystems  used  at  a  site,  computed  based  on  twice  the  average  of  their 
sizes  weighted  against  the  usage  these  subsystems  receive.  The  wait  time 
penalty  should  be  raised  if  these  subsystems  become  urgent  at  times  other 
than  the  maximum  loading  periods  experienced  at  the  site. 
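
The recommendation above (twice the average subsystem size, weighted by
usage) can be sketched as follows. This is an illustration only: the sample
sizes and usage counts are invented, the function names are hypothetical,
and the placement of the size (in K words) in the upper half of the word is
an assumption inferred from the default value 000044000000 for 36K.

```python
def large_subsystem_threshold_k(usage: list[tuple[int, int]]) -> float:
    """usage holds (size in K words, times used) for each subsystem."""
    total_uses = sum(count for _, count in usage)
    if total_uses == 0:
        raise ValueError("no usage recorded")
    weighted_average = sum(size * count for size, count in usage) / total_uses
    return 2 * weighted_average


def threshold_patch_word(threshold_k: float) -> str:
    """Octal image with the size in K in the upper 18 bits (assumed layout)."""
    return format(round(threshold_k) << 18, "012o")


# Hypothetical figures of the kind a site might take from its own reports.
sample = [(10, 500), (20, 300), (40, 200)]
threshold = large_subsystem_threshold_k(sample)
print(threshold)                        # 38.0
print(threshold_patch_word(threshold))  # 000046000000
```

The same helper reproduces the default: a 36K threshold yields 000044000000.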

The  time  period  before  a  subsystem  memory  allocation  is  aborted  can  be 
modified  through  the  following  site  option  patch.  This  time  period  should 
not  prevent  a  subsystem  from  becoming  urgent,  and  should  be  larger  than  the 
maximum wait time for the largest subsystem at the site. It should include
a margin to allow TSS to attempt to process the subsystem, once overdue.
Probable  successful  values  for  this  parameter  will  be  larger  than  the 
maximum  urgent  user  processing  time  (described  later  in  this  section)  plus 
the  time  required  to  attempt  TSS  memory  expansion.  The  value  should  be 
lower  than  some  multiple  of  this  sum,  to  prevent  an  allocation  from  becoming 
urgent  more  than  a  few  times. 

000165 OCTAL   000007640000  36/30SEC*64000 DEFAULT MAX WAIT TME (#A0W)  .MTIMS

The  default  value  should  be  modified  only  if  the  reports  produced  by  TEARS 
indicate  that  allocations  for  large  subsystems  are  becoming  urgent  several 
times  before  being  satisfied. 


The  maximum  amount  of  time  which  can  be  used  to  attempt  urgent  user 
allocations  can  be  limited  by  using  the  following  patch.  This  tunable 
parameter,  as  with  all  following  it  in  this  section,  has  not  been 
established  as  a  site  option  patch.  The  reader  is  advised  to  report  any 
modifications  made  to  these  cells  when  reporting  TSS  incidents  for  analysis. 

The maximum allowable wait time (for all but large subsystems) is computed
based  on  a  constant  function  of  the  size  of  the  subsystem  being  allocated. 

The  constant  used  can  be  modified  to  either  increase  or  decrease  the  amount 
of  time  which  can  be  spent  by  a  subsystem  waiting  non-urgent  allocation. 

000210 OCTAL   0000030  OCT 30 = 1/6 SEC PER K WAIT TIME DVSR (.TASWF)  .MTIMS

The reader should observe that lowering the value of .TASWF will increase
the amount of allowable wait time, while raising it will decrease wait
time. Lowering the wait time will raise the occurrence of urgent
allocations, while raising it will decrease urgent allocations (but will
increase average allocation times allowed before a subsystem becomes
urgent). Urgent allocation processes may be required to load large
subsystems, or to force processing-bound subsystems to swap.

The minimum core resident time can be specified with the patch shown below.
Subsystems which have been resident for less time than the value of this
cell will not be considered for force swap. The purpose of this cell is to
prevent 'thrashing', or the constant moving of active subsystems between
swap files and core. Of particular interest when computing this value are
the TEARS reports which describe the time spent by a subsystem waiting for
and receiving processor attention. The value of this cell should be high
enough to permit subsystems to receive processor attention.

       OCTAL   000001161000  36/5SEC*64000 CLK PLSES MEM RES TIM (A.MTQ)  .MTIMS

The  amount  of  subsystem  memory  which  can  be  reserved  for  urgent  user 
allocations  can  be  limited  based  on  a  fraction  of  the  total  memory  available 
for  subsystems.  The  actual  amount  reserved  will  be  equal  to  the  size  of  the 
allocation  for  the  largest  urgent  user.  Modification  to  this  cell  may  force 
TSS  to  perform  a  GEMORE  of  memory  prior  to  attempting  allocation  of  a 
subsystem. If the urgent allocation (i.e. memory size for the allocation
which has been waiting the longest) is larger than the fraction

     TSS Subsystem Memory Area / A.URMD

then the urgent user fence cannot be established, and a GEMORE will be
forced before the allocation can be satisfied.
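
The test described above can be sketched as follows, as an illustration
only. The function and argument names are hypothetical; sizes are in K
words, and the default divisor of 2 corresponds to the 50% figure given in
the comment on the A.URMD patch.

```python
def urgent_fence_possible(urgent_size_k: float,
                          subsystem_area_k: float,
                          a_urmd: int = 2) -> bool:
    """True if the urgent allocation fits within area / A.URMD."""
    if a_urmd <= 0:
        raise ValueError("divisor must be positive")
    return urgent_size_k <= subsystem_area_k / a_urmd


# With 100K of subsystem memory and the default divisor, the limit is 50K.
print(urgent_fence_possible(40, 100))  # True: fence can be established
print(urgent_fence_possible(60, 100))  # False: a GEMORE would be forced
```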

000211 OCTAL   000000000002  DEC 2 - MAX URG SIZ 50% SS CORE (A.URMD)  .MTIMS

The analyses required to compute the value of this cell are based on the
average size of the TSS executive, the number of users (thus the location of
the UST/SS memory fence, and the amount of subsystem memory available), the
average loading of this subsystem memory (yielding the amount likely to be
available and the number of allocations which are likely to be pending), and
the size of those subsystems which become urgent most often. If the
ratio of the size of these subsystems to the available subsystem memory
size is greater than 50% of the total (default), then the urgent user memory
fence cannot be established, forcing a GEMORE. If the sizes of the urgent
user subsystems cannot be lowered, some adjustment is indicated.

Force swaps will be performed to allow an urgent subsystem allocation to be
performed within the memory fence established. The minimum amount of time
which must expire before TSS will consider GEMOREs for core size increases
to satisfy the allocation is settable based on the cell shown below.

000161 OCTAL   0000175000  36/1SEC*64000 CLK PLSES BFR SIZ INC (.TASID)  .MTIMS

The  minimum  interval  between  attempts  to  increase  the  size  of  TSS  because  of 
unsatisfied  urgent  allocation  requests  is  settable  based  on  the  following 
cell. 


000162 OCTAL   000003523000  36/5SEC*6400 CLK PLSES BTWN GMORE (.TASCP)

Based on the analyses described above, it may be desirable to decrease the
minimum time which must expire before a size increase can be considered for
large urgent allocations. Based on the failure rate for GEMORE requests (as
determined from GMF/TSSM traces, and caused by a heavy non-TSS core load),
the interval between these requests might also be raised.

The  maximum  amount  of  time  which  will  be  spent  by  TSS  in  the  attempt  for  a 
single  GEMORE  request  can  be  controlled,  to  limit  the  performance 
degradation  faced  by  TSS  users  in  situations  where  growth  is  not  possible. 
Reasons  why  this  might  occur  include  programs  marked  'dead'  directly  above 
TSS  in  core,  and  are  usually  transitory  in  nature.  Should  response  times  be 
affected  by  the  length  of  time  spent  by  TSS  attempting  to  GEMORE  memory, 
this  cell  might  be  changed  to  a  lower  value. 

000163 OCTAL   000016514000  36/60*SEC*64000 CLK PLSES GMOR LMT (.TATMC)  .MTIMS

The  interval  between  GEMORE  attempts  to  expand  TSS  memory  can  be  controlled, 
preventing  TSS  from  attempting  to  grow  too  fast,  and  limiting  the  time  spent 
in  considering  size  changes.  The  following  cell,  in  conjunction  with  .TAMRI 
(discussed  below)  will  limit  the  rate  with  which  TSS  will  change  size. 

CC     CC      CC                                                       CC
1      8       16                                                       73
000164 OCTAL   002342000  36/10*SEC*64000 CLK PLSES GMR LMT ATT (.TAGMI) .MTIMS

As with size increase considerations, the intervals between size reduction
considerations and attempts can also be controlled. If the intervals
described above for size increases are shortened, permitting rapid
acquisition of space by TSS, then consideration should also be given to
making similar changes to the cells described below, permitting rapid return
of that space to the system as well. If corresponding changes are not
made to this logic, TSS will monopolize core memory longer than required to
satisfy its subsystem requirements.

The minimum interval between size reduction considerations can be controlled
based on the usage of the TSS swap files. If the swap file usage exceeds
the value of the cell shown below, consideration of size reductions will be
prevented. Situations may arise, and may be reported by TEARS, where TSS
appears to be too large, but swap file usage (as indicated by GMF/TSSM
traces) will show that numerous swap-ins are in progress.

000206 OCTAL   0000000113  DEC 75% SWP FIL UTL PREVNT RDCTN (.TAFMR)  .MTIMS

If  swap  file  usage  is  not  in  excess  of  .TAMPR,  the  cell  shown  below  will 
limit  the  number  of  times  that  a  TSS  size  reduction  will  be  considered. 

000157 OCTAL   0035230000  36/2MIN*60*64000 CLK PLSES BTWN RDCTN (.TAMRI)  .MTIMS

After  the  need  for  a  TSS  size  reduction  is  identified,  a  settable  delay  will 
ensue  before  it  is  actually  performed.  This  delay  will  prevent  rapid  core 
releases  from  being  scheduled  (which  would  likely  be  followed  by  rapid  core 
requests).  This  damper  acts  to  synchronize  the  clocks  which  limit 
considerations  of  TSS  size  reductions  and  size  increases. 

According to documentation and program comments, successful values for this
cell  should  be  less  than  the  sum  of  .TATMC  (maximum  size  change  wait 
interval)  plus  two  times  .TAGMI  (minimum  interval  between  GEMORE  requests). 
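
The upper bound stated above can be worked out from the .TATMC and .TAGMI
values listed earlier in this section. The sketch below is an illustration
only; the function name is hypothetical, and the values are read as clock
pulses at 64000 pulses per second, as indicated by the patch comments.

```python
PULSES_PER_SECOND = 64000  # rate implied by the "*64000" patch comments

TATMC = int("000016514000", 8)  # maximum size change wait interval
TAGMI = int("002342000", 8)     # minimum interval between GEMORE requests

# The delay cell should be less than .TATMC plus two times .TAGMI.
UPPER_BOUND = TATMC + 2 * TAGMI


def delay_is_acceptable(delay_pulses: int) -> bool:
    """True if a proposed delay (in clock pulses) is below the stated bound."""
    return 0 <= delay_pulses < UPPER_BOUND


print(UPPER_BOUND / PULSES_PER_SECOND)                # 80.0
print(delay_is_acceptable(30 * PULSES_PER_SECOND))    # True
print(delay_is_acceptable(120 * PULSES_PER_SECOND))   # False
```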

The  final  settable  parameter  which  relates  to  the  user  subsystem  memory  area 
concerns  not  user  subsystems  but  rather  VIP  terminal  buffers.  The  Buffered 
Terminal  Output  System  (BTOS)  will  allocate  Extended  Memory  Buffers  (EMBs) 
in the subsystem memory area to hold terminal output prior to its being
transmitted  to  terminals.  The  site  option  patch  shown  below  can  be  used  to 
limit  the  number  of  VIP  terminals  which  can  be  active  (on-line)  at  any  one 
time,  and  thus  the  number  of  EMBs  which  might  be  active. 

14.6.7.3.7 TSS Executive Processing. The site option patches described in
this  section  relate  to  the  TSS  executive  performance,  and  will  affect  the 
time  required  to  perform  many  of  the  functions  related  to  TSS  response. 


TSS  maintains  its  own  Peripheral  Allocation  Tables  (PATs)  which  define  the 
location  and  size  of  the  files  accessed  by  the  executive,  as  well  as  those 
accessed  by  each  time  sharing  user  active  in  the  system.  If  analysis  of  the 
GMF/TSSM  data  collected  during  a  monitoring  session  reveals  the  presence  of 
trace  type  3s,  then  the  amount  of  space  available  to  TSS  to  maintain  these 
PATs  is  insufficient.  PAT  tables  are  maintained  by  TSS  in  its  Slave  Service 
Areas  (SSAs).  The  number  of  SSAs  required  is  computed  by  TSS  based  on  the 
expected  size  of  a  PAT  entry  times  the  number  of  entries  expected  to  service 
the  maximum  number  of  TSS  users  which  are  allowed.  There  are  two  ways  in 
which  this  computation  may  need  to  be  overridden.  First,  the  number  (or 
size)  of  the  files  used  by  TSS  users  may  be  greater  than  that  expected  by 
TSS.  Second,  disk  space  fragmentation  might  expand  the  size  of  the  PAT 
entries  required  to  define  the  physical  locations  of  the  files  accessed  by 
TSS  users.  When  GMF/TSSM  trace  type  3s  are  encountered,  the  number  of  SSAs 
reserved  by  TSS  (and  hence,  the  space  available  for  PAT  entries)  may  be 
modified  through  the  use  of  the  following  patch. 

               000000000000  0=COMPUTE BASED ON MAX USERS  .TSSA

The minimum interval between entries to the Line Service state (section
14.6.7.1) to process normal terminal I/O requests, terminations, and status
checks is controlled by the following non-site option parameter. If
analysis of the GMF/TSSM data tapes produced during monitoring sessions
reveals that the line service state is being entered without any need, then
this interval may be raised. If analysis reveals that the volume of work
faced by this remote I/O module is high, the interval may be lowered to
improve response.

Periodic service performed by the TSS executive Line Service module is
performed according to the setting of the parameter shown below. Periodic
service functions include noticing of terminal disconnects and new log-ons.
While not a significant element of terminal response time, this factor might
be raised if analysis reveals that the number of terminal connects and
disconnects is very low. The possible savings in terms of processing time
are not high, but might be worth considering on a heavily loaded system.

Finally, the maximum size, minimum size, amount to obtain, and amount to
release for core storage can be specified by the following site option
patch. Minimum size requirements are those which will be sufficient to
satisfy the load experienced by TSS during periods of off-peak use. The
difference between heavy and light use must be defined by each site. During
periods of low (off-peak) use, TSS should not be forced to GEMORE core. The
amount of core required by TSS at the heaviest point in its processing day
should be measured through the GMF/TSSM and used as the maximum size.
Increments  to  GEMORE  and  release  should  be  developed  based  on  the  frequency 
with  which  these  functions  are  performed.  Consideration  should  be  given  to 
the  sizes  of  the  subsystems  which  most  commonly  cause  size  increases  to 
occur.  The  frequency  with  which  these  subsystems  are  used  and  the  amount  of 
core  required  to  run  them  will  indicate  the  amount  of  core  which  could  be 
GEMOREd  by  the  TSS  executive.  In  general,  GEMORE  and  release  core 
increments  are  made  based  on  the  weighted  average  core  requirements  which 
cause  size  increases  to  be  considered.  The  parameter  for  core  release 
should  be  an  even  multiple  of  that  used  for  GEMORE,  for  obvious  reasons. 


CC     CC      CC
1      8       16
000176 OCTAL   000000000144  .TAMMS  MAX TSS CORE IN K
000177 OCTAL   170000000000  .TASMS  MIN TSS CORE IN K
000201 OCTAL   000000000024  .TAMII  MEM ADD INCREMENT
000202 OCTAL   024000000000  .TASRI  MEM REL INCREMENT

es  of  these  parameters  is 
follows.

.TAMMS is the octal number of K words which limits the maximum size of the
executive. Bits 0-17 of .TASMS control the minimum size of the TSS
executive in words (not K words). The altered value of this variable is
equivalent to 45K, and is considered the minimum reasonable number. .TAMII
is the memory addition increment in K words. Bits 0-17 of .TASRI are the
number of words (not K words) which should be used for memory releases.
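
Because two of these cells are expressed in K words and two in words held
in bits 0-17, the encoding is easy to misread. The sketch below is an
illustration only. It assumes that bits 0-17 are the upper (leftmost) half
of the 36-bit word and that 1K is 1024 words; the function names are
hypothetical.

```python
WORDS_PER_K = 1024      # assumption: 1K = 1024 words
HALF_MASK = 0o777777    # 18 bits


def lower_half(word: str) -> int:
    """Value held in the lower 18 bits of an octal patch word."""
    return int(word, 8) & HALF_MASK


def upper_half(word: str) -> int:
    """Value held in bits 0-17 (assumed to be the upper 18 bits)."""
    return int(word, 8) >> 18


def words_to_upper_half(words: int) -> str:
    """Octal image of a word count placed in bits 0-17."""
    if not 0 <= words <= HALF_MASK:
        raise ValueError("value does not fit in 18 bits")
    return format(words << 18, "012o")


print(lower_half("000000000144"))                 # .TAMMS: 100 (K words)
print(lower_half("000000000024"))                 # .TAMII: 20 (K words)
print(upper_half("024000000000") // WORDS_PER_K)  # .TASRI: 10 (K words)
print(words_to_upper_half(45 * WORDS_PER_K))      # 45K -> 132000000000
```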

14.6.7.3.8 Subdispatch Queue Processing. With WWMCCS software release
W7.2B (commercial 4JS), a change was made to the processes used to grant
processor attention to active user subsystems. The system dispatching
module, .MDISP, was modified to process a subdispatch queue for both TSS and
Transaction Driven System (TDS). The purpose of these subdispatch queues is
to grant privileged attention to those portions of the system which are
interactive, or which will affect the response times experienced by terminal
users.

There are several parameters which can be adjusted to affect the degree to
which these subdispatch queues will receive privileged attention from GCOS.
An understanding of the processes involved is important before any
adjustment is attempted.

The  first  of  the  three  subdispatch  queues  used  by  TSS  is  the  Available 
queue,  which  is  a  linked  list  of  all  queue  entries  which  are  not  currently 
linked  into  either  the  Ready  or  Fault  queues.  When  a  subsystem  has  been 
loaded  into  the  Subsystem  memory  area,  an  entry  is  obtained  from  this 
available  queue,  loaded  with  the  address  of  the  subsystem,  and  linked  into 
the  Ready  queue.  Once  placed  in  this  Ready  queue,  no  further  GMF/TSSM 
traces  for  this  subsystem  (and  hence,  the  user)  will  be  encountered  until 
the  subsystem  has  received  attention  from  the  processor. 

.MDISP,  during  its  processing,  will  remove  an  entry  from  the  Sub-dispatch 
Ready  queue  when  selecting  a  new  slave  process  for  dispatch.  Tunable 
parameters  exist  which  can  be  used  to  adjust  how  often  this  selection  will 
be  made  from  the  subdispatch  queue  as  opposed  to  normal  batch  slaves. 

Once  a  subsystem  has  been  dispatched  to,  some  fault  will  occur  to  end  the 
dispatch.  For  most  subsystems,  this  will  be  a  DRL  fault  signifying  either 
an  I/O  request  or  subsystem  termination.  In  certain  cases,  the  fault  may  be 
a  timer  runout,  but  this  will  occur  for  only  a  small  percentage  of 
processor-bound  subsystems.  When  this  fault  occurs,  .MDISP  will  place  the 
subsystem  address  in  an  entry  to  the  Subdispatch  Fault  queue.  During 
executive  processing,  TSS  will  retrieve  this  entry  from  the  Fault  queue,  and 
act  to  resolve  the  fault  encountered.  Subsystems  which  have  received  timer 
runout  faults  will  be  placed  back  in  the  Ready  queue  for  further  processor 
attention.  DRL  faults  must  first  be  processed  before  the  subsystem  can 
become  eligible  for  processor  attention  again. 

Investigation  has  shown  that,  in  certain  circumstances,  subsystems  which  are 
heavily  processor-bound  can  degrade  system  performance  by  monopolizing  the 
available processor attention. While this will not greatly affect other
user subsystems (since the Ready queue is processed sequentially), it will
limit the attention which will be given to other batch jobs, because of the
special attention given to subdispatches. Log number #BNF has been
developed  for  release  W7.3.0  to  minimize  this  potential  problem  by  limiting 
the  number  of  consecutive  subdispatches  (without  an  intervening  I/O  or  other 
fault)  a  subsystem  can  receive  before  a  delay  is  forced  for  that  subsystem. 
Before  its  release,  this  patch  may  become  a  site  option. 
It is currently believed that other tunable parameters can also be used to
limit this problem.

Within  TSS,  the  following  tunable  parameters  can  be  used  to  affect 
sub-dispatch  queue  processing.  The  patch  below  can  be  used  to  affect  the 
duration  of  a  processor  time-slice  which  will  be  granted  to  a  subsystem 
which  has  been  placed  in  the  Ready  queue. 


CC     CC      CC                                                       CC
1      8       16                                                       73

002630 OCTAL   000000002400    36/20*64000 CLK PLSES (.QQTM)            .MTIMS
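
As a rough check on the patch value above, the octal word can be converted to decimal. The following is a minimal sketch only: it assumes, from the "20*64" in the card comment, that the timer counts 64 clock pulses per millisecond, which this report does not confirm. The function names and the PULSES_PER_MS constant are hypothetical and exist only for illustration.

```python
# Hypothetical helpers for reading and building a .QQTM patch word.
# Assumption (taken from the card comment "20*64"): the timer counts
# 64 clock pulses per millisecond. This unit is not confirmed here.
PULSES_PER_MS = 64


def qqtm_pulses(octal_word: str) -> int:
    """Convert the 12-digit octal patch word to a pulse count."""
    return int(octal_word, 8)


def qqtm_milliseconds(octal_word: str) -> float:
    """Time slice in milliseconds under the assumed pulse rate."""
    return qqtm_pulses(octal_word) / PULSES_PER_MS


def qqtm_word(milliseconds: int) -> str:
    """Build a 12-digit octal patch word for a desired slice length."""
    return format(milliseconds * PULSES_PER_MS, "012o")


if __name__ == "__main__":
    print(qqtm_pulses("000000002400"))        # pulse count in the patch
    print(qqtm_milliseconds("000000002400"))  # slice length, assumed unit
    print(qqtm_word(40))                      # word for a doubled slice
```

Under that assumption the value shown corresponds to a 20 millisecond time slice, and a site wishing to double the slice would patch the word produced by qqtm_word(40).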

Also  within  TSS,  the  patch  below  can  be  used  to  affect  the  threshold  of 
Ready  queue  entries  which  will  force  the  executive  to  perform  certain 
actions  during  its  own  dispatch. 


CC     CC      CC                                                       CC
1      8       16                                                       73

002641 OCTAL   000012000004    RDY QUE LMT, THRESHOLD (.QRCL)           .MTIMS


.QRCL is used by TSS in the following manner during an executive dispatch:

o If .QRCL (18-35) is greater than the number of entries which are in
the Ready queue, TSS will relinquish to allow subsystem dispatching.

o If .QRCL (18-35) is less than the number of entries waiting in the
Ready queue, and if both are lower than the limit in .QRCL (0-17),
the executive will first process Fault queue entries before
relinquishing to allow subsystem dispatching.

o If the number of entries in the Ready queue is greater than .QRCL
(0-17), then the executive will enter the Line Processing state to
handle any outstanding remote I/O requests.
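
The three rules above can be sketched as a small decision function. This is an illustration only, not TSS code: the names ExecAction, split_qrcl, and executive_action are hypothetical, and the sketch assumes that bits 0-17 are the upper half of the 36-bit word and bits 18-35 the lower half. The text does not say what happens when the Ready queue count equals either value, so those cases are returned as UNSPECIFIED rather than guessed.

```python
from enum import Enum
from typing import Tuple


class ExecAction(Enum):
    RELINQUISH = "relinquish to allow subsystem dispatching"
    FAULTS_THEN_RELINQUISH = "process Fault queue entries, then relinquish"
    LINE_PROCESSING = "enter the Line Processing state"
    UNSPECIFIED = "case not covered by the three rules"


def split_qrcl(octal_word: str) -> Tuple[int, int]:
    """Split a 36-bit octal word into (bits 0-17, bits 18-35)."""
    word = int(octal_word, 8)
    return word >> 18, word & 0o777777


def executive_action(octal_word: str, ready_entries: int) -> ExecAction:
    """Apply the three .QRCL rules to a Ready queue entry count."""
    limit, threshold = split_qrcl(octal_word)
    if threshold > ready_entries:
        return ExecAction.RELINQUISH
    # "both are lower than the limit": threshold < ready_entries < limit
    if threshold < ready_entries < limit:
        return ExecAction.FAULTS_THEN_RELINQUISH
    if ready_entries > limit:
        return ExecAction.LINE_PROCESSING
    return ExecAction.UNSPECIFIED


if __name__ == "__main__":
    for n in (2, 7, 11):
        print(n, executive_action("000012000004", n).value)
```

With the patch value shown (000012000004), the limit is 10 and the threshold is 4, so a Ready queue of 2 entries leads to an immediate relinquish, 7 entries to Fault queue processing first, and 11 entries to Line Processing.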


Modification  to  .QRCL  will  affect  the  speed  with  which  TSS  will  perform  its 
executive  functions,  thus  affecting  subsystem  allocation  and  fault 
processing.  By  raising  the  value  of  .QRCL  (0-17),  the  number  of  times  that 
TSS  will  attempt  to  service  both  outstanding  faults  and  remote  I/O  requests 
will decrease. By raising the value of .QRCL (18-35), subsystem processor
attention will be increased, but the number of times that the Sub-dispatch
Fault queue will be processed will be decreased. It is suggested that,
before  attempting  to  adjust  .QRCL,  site  personnel  first  attempt  to  improve 
response  time  by  adjusting  the  length  of  both  an  executive  dispatch  (to 
decrease  fault  queue  wait  time  by  allowing  TSS  to  process  all  outstanding 
faults  during  a  single  dispatch),  as  well  as  .QQTM  (to  allow  subsystems  a 
longer  processor  time  slice). 


Within the system dispatching module, .MDISP, the decision as to which
process will receive the next dispatch is made based on two counters.

The default values allow six subdispatches to occur for every two normal
batch dispatches. The patch shown below can be used to affect this balance.


CC     CC      CC                                                       CC
1      8       16                                                       73

001305 OCTAL   000000000006    QDCT SUB-DSP COUNTER RFSH                .MDISP

001307 OCTAL   000000000002    QNDR BATCH COUNTER RFSH                  .MDISP


Consecutive dispatches are made by .MDISP until one counter has been
emptied, or until no further (subdispatch or batch) processes are waiting,
before the other type of process is given attention. Thus, six
sub-dispatches will be made, followed by two normal batch dispatches (one of
which will probably be to the TSS executive if Priority 'B' processing is
active), given the default values shown. As with the threshold to .MTIMS
shown above, it is suggested that site personnel first attempt to affect
performance by varying the length of a processor dispatch before attempting
to modify the ways in which dispatches are granted.
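
The alternation described above can be illustrated with a simplified model. The sketch below is not .MDISP logic: it assumes a fixed amount of waiting work with no new arrivals, treats each dispatch as completing one unit of work, and uses the hypothetical name dispatch_order. The parameters qdct and qndr stand for the two counter refresh values in the patch.

```python
def dispatch_order(sub_waiting: int, batch_waiting: int,
                   qdct: int = 6, qndr: int = 2) -> str:
    """Return the dispatch sequence as a string of 'S' and 'B'.

    'S' is a subdispatch and 'B' is a normal batch dispatch. Each burst
    ends when its counter is exhausted or no work of that type waits.
    """
    order = []
    while sub_waiting > 0 or batch_waiting > 0:
        # Subdispatch burst: up to qdct consecutive subdispatches.
        burst = min(qdct, sub_waiting)
        order.append("S" * burst)
        sub_waiting -= burst
        # Batch burst: up to qndr consecutive batch dispatches.
        burst = min(qndr, batch_waiting)
        order.append("B" * burst)
        batch_waiting -= burst
    return "".join(order)


if __name__ == "__main__":
    print(dispatch_order(8, 3))                    # default 6:2 balance
    print(dispatch_order(12, 4, qdct=3, qndr=1))   # an altered 3:1 balance
```

Changing qdct and qndr in this model shows how the patch shifts the ratio of subdispatches to batch dispatches without changing the length of any single dispatch.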

14.6.7.4  Identification  of  TSS  Response  Time  Problems.  The  performance 
information  gathered  by  the  GMF/TSSM  can  be  used  by  site  personnel  to 
identify  the  frequency  of  occurrence  of  response  time  problems.  Delays 
which  occur  can  be  caused  by  any  number  of  factors.  These  factors  can  be 
either  related  or  unrelated  to  user  actions.  This  section  describes  the 
ways  in  which  the  reports  produced  by  TEARS  can  be  used  to  identify  the 
occurrence  of  these  response  time  problems. 

14.6.7.4.1 General. Normally, site personnel will first encounter a
response time problem as the result of user complaints. Once a problem has
been reported, the GMF/TSSM may have to be executed over a period of days
(or weeks) before the particular problem noticed by users is encountered
again. Typically, a user is unable to reproduce a problem at will, as most
are related to either system loads or some other uncontrollable circumstance.

If  site  personnel  have  incorporated  the  suggestions  contained  in  Section 
14.6.7.3  of  this  report,  many  of  the  problems  which  are  typically 
encountered  can  be  minimized  or  prevented. 

In order to identify the existence of a problem (if, in fact, any problem
exists), site personnel must first attempt to categorize the nature of the
complaint. The answers to the following questions are of paramount
importance in this categorization:

o What was the user doing? What subsystem was he attempting to run, and
how did the delay manifest itself?

o How long was the user delayed?

o How many users were affected? Were they all trying to do the same
(type of) thing?

o What USERID was being used? What terminal ID (very important in the
case of shared accounts)?

o During what time of day did (does) the problem occur?

o Can the problem be recreated, or forced to occur at will?

Without  answers  to  at  least  most  of  these  questions,  user  problems  with 
terminal  response  time  will  be  difficult  at  best  to  even  identify.  The 
volume  of  information  collected  by  the  GMF/TSSM  will  almost  preclude 
detailed  analysis  of  all  of  the  trace  information  collected  over  a  normal 
processing  day. 

If the nature of the complaint or problem experienced seems to indicate a
continuing or worsening situation, site personnel should execute the
GMF/TSSM to collect trace information. The GMF/TSSM collector should be
executed several times, under varying loads, and during different times of
day, over several days before attempting to reduce and analyse the collected
data. In this way, the site analyst can be better assured that he has
collected sufficient information for resolution. Whether attempting to
analyse a specific problem, or to optimize performance for all users, the
collection of trace data over a period, and under varying loads, is
important to allow for an understanding of the usage profile at a specific
site.

Once sufficient GMF/TSSM trace data has been collected, site personnel
should begin the process of response problem identification with the TEARS
RESPONSE phase of data reduction. The reader is referred to the GMF Users
Manual for direction concerning the execution of TEARS.

14.6.7.4.2  Analysis  of  TEARS  RESPONSE  Phase  Reports.  In  this  section,  each 
of  the  reports  produced  by  the  RESPONSE  phase  of  the  TEARS  data  reduction 
system  is  examined,  and  some  direction  as  to  the  type  of  problem  each  can 
indicate  is  described. 

14.6.7.4.2.1 Timesharing Reduction Event Log. The first report produced by
the RESPONSE phase of the TEARS system is the event log. Figure 14-15
illustrates this report. The importance of the log lies in its use to
identify the location within the data tapes of significant events during a
tracing session. As described in section 14.6.7.3.1, it would be much
easier if only a portion of the trace information had to be analysed. This
report should be used to identify the portion by both record number and time
of day. For example, if a TRAX-EXEC user on terminal DA reports a problem,
then only that portion of the GMF/TSSM data tape from record number 737626
to 1051854 would be of interest. This event log should be retained as a
permanent index to any GMF/TSSM data tape which is used for further analysis.

Figure 14-15. Time Sharing Reduction Event Log Report

14.6.7.4.2.2 Response Times for all Users. This is the first plot report
produced by TEARS. The report provides the analyst with two very important
types of information. First, an indication of the average response times
being experienced is directly displayed. Second, anomalies in this average
can first be discovered and noted. Figure 14-16 contains an example of this
type of report. Note the average response times which have been marked with
a pen. Compare how this average rises with the number of active terminals
(left hand column). The LINE column displays the user who experienced the
longest delay during the time interval shown on each line of the report.

The  example  shown  in  figure  14-16  shows  that  between  3:00  and  3:01,  the 
minimum  response  time  experienced  suddenly  jumped  to  over  30  seconds.  This 
type  of  jump  or  difference  should  be  noted,  and  the  time  of  day  should  be 
compared  with  other  reports. 

This  first  plot  cannot  reveal  any  more  than  the  fact  that  the  response  time 
was  different  during  some  interval.  No  indication  of  the  source  of  this 
difference  can  be  found  here. 

The user of TEARS is advised that the trace information shown at the
beginning and ending of the tracing period is skewed due to the fact that
incomplete intervals are shown. For example, the GMF/TSSM may record the
end of a response interval just after the beginning of the tracing period,
but this value will be skewed downward because TEARS must assume that the
response interval began just before the beginning of trace data collection.

A  similar  situation  exists  at  the  end  of  a  tracing  period,  when  the 
beginning  of  an  interval  is  noted,  but  tracing  is  shut  off  before  the 
response  interval  completes. 

The validity of this report, however, is compromised by the fact that it
actually displays wait times, not response times. In the example shown,
terminal TM is executing a BASIC program which is trapped in an endless
loop. Thus, no responses are given to terminal TM at all between 2:41 and
3:33, but the response time experienced by that terminal seems to be
steadily degrading. Because of this significant difference in definition,
terminal TM will mask any response problems for terminals which were truly
interactive during this interval.

This  problem  is  shared  by  the  next  two  RESPONSE  TIMES  reports  described 
below. 


14.6.7.4.2.3 Response Times For User Not Requesting More Core. The example
shown in figure 14-17 is a subset of the information shown on the ALL USERS
report described above. The difference is the fact that this report shows
response times for users who have not requested core during the time frames
shown. What this means is that they are within a subsystem, and have not
been swapped by TSS. Again, an anomaly exists at 3:01, with terminal TQ
experiencing the most delay.

Figure 14-16. Response Times For All Users Report (Parts 1 and 2 of 2)

The information in this report is best used in comparison to the next report
below for users who do have outstanding core requests. The magnitude of
the differences in response times shown during common times for these
reports is an indication of the times required to perform memory allocation
and loading. Comparison of these two reports to the first ALL USERS report
should (but does not clearly) indicate the effect of memory allocation, user
subsystem swapping, and subsystem I/O on response times.

14.6.7.4.2.4 Response Times For Users With Core Request During Line Idle.
This report is shown in figure 14-18. This third plot is the second subset
of the overall response time indications given by the ALL USERS response
times report described in section 14.6.7.4.2.2. This subset (of the overall
response times shown in the first RESPONSE report) is based on those users
(subsystems) which required a memory allocation during the response time
interval shown. A memory allocation is made for users who are either
invoking a new subsystem (i.e. CARD N) or whose subsystem has been swapped
and is now being unswapped. The response times noted in this report will
always be higher than those in the 'NO CORE REQUEST' plot, and may, in fact,
reveal the sources of anomalies noted in the 'ALL USERS' plot.

Note  the  anomaly  discovered  earlier  at  time  3:01,  which  also  appears  in  this 
plot.  It  seems  that  the  major  element  (in  the  'ALL  USERS'  plot)  for  this 
delay  concerned  terminal  TM,  who  was  attempting  either  a  subsystem  load  or 
an  unswap.  Indeed,  in  the  example  shown,  terminal  TM  is  shown  each  time  the 
minimum  response  time  is  high.  Either  the  subsystem  being  used  by  this 
terminal  is  very  large  (as  was  the  case)  or  terminal  TM  is  processor  bound 
thus  becoming  a  swap  candidate  (also  the  case)  or  terminal  TM  is  attempting 
to  load  a  very  large  subsystem.  If  this  were  an  initial  subsystem  load, 
some  indication  could  be  gained  from  several  of  the  reports  described  below. 

14.6.7.4.2.5 Total Time In Subdispatch Queue Report. The SUBDISPATCH
reports described in the following sections can be used to identify the
nature of the work being performed by TSS subsystems during the tracing
period. Of particular interest in the report shown in figure 14-19 (and
repeated in each of the next two reports) is the column labeled TROUT.
TROUT is the percentage of the subdispatches occurring during the time
frame just ended which terminated with a timer runout fault (as opposed to a
DRL or some other fault). In the example shown, it appears that all users
are processor bound, and could be greatly affected by some increase in the
amount of processor time per sub-dispatch (see section 14.6.7.3.9). The BSY
column provides a vague measure of how busy the line service module of TSS
has become during the interval. The example shown indicates that the
subsystems in execution do not perform much I/O. CPU is a measure of the
queue processing overhead faced by .MDISP, but since this cannot be affected
by site personnel, its purpose remains unclear.
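
The TROUT definition given above amounts to a simple ratio. The sketch below shows that calculation under stated assumptions: the fault labels "TIMER_RUNOUT" and "DRL" and the function name trout_percent are hypothetical, and the input is taken to be one label per subdispatch that ended during a reporting interval. It is not the TEARS implementation.

```python
from collections import Counter
from typing import Iterable


def trout_percent(ending_faults: Iterable[str]) -> float:
    """Percentage of subdispatches that ended with a timer runout.

    ending_faults holds one label per subdispatch in the interval.
    Returns 0.0 for an interval with no subdispatches.
    """
    counts = Counter(ending_faults)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * counts["TIMER_RUNOUT"] / total


if __name__ == "__main__":
    interval = ["TIMER_RUNOUT", "TIMER_RUNOUT", "TIMER_RUNOUT", "DRL"]
    print(trout_percent(interval))
```

A value near 100 for most intervals is what the text describes as an indication that the users are processor bound.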


Figure 14-19. Total Time In Subdispatch Queue Report (Part 1 of 2)

This first SUBDISPATCH report is the sum of the next two reports. As with
the RESPONSE TIME reports just described, the three are a set, and are meant
to be evaluated together. Unfortunately, this is difficult for the same
reasons.


14.6.7.4.2.6  Time  in  Subdispatch  Queue  Waiting  Service.  This  report  is 
shown  in  figure  14-20,  and  provides  a  measure  of  the  amount  of  time  spent  by 
a  subsystem  (Sub-dispatch  entry)  in  the  Fault  queue  waiting  for  TSS  action. 
Note  that  the  anomaly  discovered  in  section  14.6.7.4.2.2  does  not  appear 
here.  This  would  indicate  that  TSS  executive  processing  is  not  a  factor  in 
the  delay  incurred. 

Comparisons of the time spent awaiting TSS executive action cannot be made
against any other measure or report currently reported by TEARS, but should
be made across time. If, for example, a time period is discovered during
which the average time a subsystem had to wait for executive service
increased, analysts could search for some factor which is interfering with
processor dispatches to the executive. Finally, this report should be used
to discover whether at any time during the tracing interval, the total wait
time exceeded the length of a TSS priority 'B' dispatch.

The  overall  wait  time  for  servicing  a  Fault  queue  entry  can  be  affected  by 
increasing  the  priority  'B'  parameters  of  TSS,  as  described  in  section 
14.6.7.3. 


14.6.7.4.2.7  Processor  Time  In  Subdispatch.  This  report  is  shown  in  figure 
14-21,  and  is  the  second  subset  of  the  TOTAL  report  described  in  section 
14.6.7.4.2.5.  The  time  interval  shown  is  that  spent  by  the  subsystem  from 
the  time  it  was  placed  in  the  Sub-dispatch  Ready  queue,  until  it  was 
retrieved  by  TSS  from  the  Fault  queue. 

In the example shown, note the static nature of this average. Apparently,
little else was going on in the system at the time. The average time spent
by an entry in the Ready queue did not vary with the number of users in
execution, as would have been the case if .MDISP had outstanding batch work.

14.6.7.4.2.8  CPU  Monitor  Driven  Reporting.  These  reports,  which  are 
produced  during  the  response  phase  of  the  TEARS  System,  have  not  been 
evaluated  in  this  report,  as  their  generation  is  triggered  by  trace  records 
generated  by  the  GMF/CPUM,  not  the  TSSM. 

14.6.7.4.2.9 TSS Subtraces Encountered Report. The usefulness of this
report, shown in figure 14-22, lies in the types of events which are
reported. Along with the first RESPONSE phase report, this can be a useful
indication of problems which might not be even implied by earlier reports.

The  following  events  signal  some  problem  which  is  affecting  response  time, 
and  which  should  not  be  present  in  a  healthy  system. 

Figure 14-21. Processor Time In Subdispatch (Part 1 of 2)

Figure 14-22. (Part 2 of 2)

Trace type 3. PAT denial. This is caused by a lack of PAT space in TSS
slave service areas. The occurrence of this type of trace means that too
few SSAs are being reserved by TSS because of either disk fragmentation,
heavy user load, or some anomaly in the number of files used by TSS
terminals. In any case, the solution lies in raising the number of SSAs
which are reserved by TSS. Section 14.6.7.3 describes a patch to location
136 of TSSA which should be used to prevent this trace from occurring.

Trace  type  31.  This  trace  is  generated  when  a  user  subsystem  is  delayed  two 
seconds  due  to  SMC  contention  problems.  The  cataloged  permanent  disc  files 
are  stored  under  USERIDS  which  are  divided  into  SMC  sections.  When  one  user 
accesses  an  SMC  section,  the  software  will  momentarily  lock  it.  If  this 
trace  type  occurs  more  than  a  few  times,  the  user  affected  will  begin  to 
note  a  delay  in  his  processing.  In  these  cases,  analysis  is  indicated  to 
find  how  this  contention  can  be  minimized  or  prevented. 

Trace  type  93.  This  trace  type  should  not  occur.  If  it  is  discovered,  a 
hardware  or  GRTS  problem  is  indicated  which  should  be  reported  to  field 
engineering  personnel. 

Trace  type  83.  This  trace  type  indicates  an  overload  in  the  subsystem 
memory  area  which  is  delaying  VIP  terminal  output  from  being  produced.  If 
discovered,  either  the  minimum  size  for  TSS  should  be  modified  upwards,  or 
the  maximum  number  of  VIP  terminals  allowed  should  be  decreased. 

Trace types 65, 66, 67, 68, 69, 70, 72, and 73. High values for these trace
types indicate that the subsystem memory area is facing very high
contention, causing urgent (overdue) memory allocations, force swaps, and
TSS size increases. By raising the maximum (and minimum) TSS sizes, and by
lowering the 'large' subsystem definition, the occurrence of these trace
types should be minimized.

In the example shown, note that a PAT denial occurred 44 times during the
tracing interval. An EBM buffer for VIP output was refused 19 times. TSS
considered a size increase 59 times (trace type 65), and initiated such an
increase 12 times (trace type 66). Urgent allocations occurred 93 times
(trace types 67 and 68), and 784 force swaps were made (trace type 69).
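
A site could screen the subtrace counts for the trace types discussed above with a short script. The following is a minimal sketch, not part of TEARS: the names PROBLEM_TRACES, MEMORY_CONTENTION, and flag_traces are hypothetical, and any nonzero count is flagged because the text gives no numeric thresholds for "more than a few times" or "high values". The sample counts are those quoted from the example report; the combined figure for types 67 and 68 is omitted because the report does not give the split.

```python
from typing import Dict, List

# Trace types the text singles out, with the suggested direction of remedy.
PROBLEM_TRACES = {
    3: "PAT denial: raise the number of SSAs reserved by TSS",
    31: "SMC contention: analyse how contention can be reduced",
    83: "VIP output delayed: raise TSS minimum size or lower VIP limit",
    93: "should not occur: report to field engineering",
}
# Trace types indicating contention in the subsystem memory area.
MEMORY_CONTENTION = (65, 66, 67, 68, 69, 70, 72, 73)


def flag_traces(counts: Dict[int, int]) -> List[str]:
    """Return one finding per problem category with a nonzero count."""
    findings = []
    for trace_type, hint in sorted(PROBLEM_TRACES.items()):
        n = counts.get(trace_type, 0)
        if n > 0:
            findings.append(f"type {trace_type} x{n}: {hint}")
    memory_total = sum(counts.get(t, 0) for t in MEMORY_CONTENTION)
    if memory_total > 0:
        findings.append(
            f"types 65-73 x{memory_total}: memory contention, "
            "consider raising TSS sizes"
        )
    return findings


if __name__ == "__main__":
    example = {3: 44, 83: 19, 65: 59, 66: 12, 69: 784}
    for line in flag_traces(example):
        print(line)
```

Flagging every nonzero count is deliberately conservative; a site would likely substitute its own thresholds once a baseline for a healthy system is known.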

14.6.7.4.2.10 Derails Report. This final report produced during the
RESPONSE phase of TEARS processing, shown in figure 14-23, is a statistical
analysis of the number of times each Derail type is executed. It is of
academic interest only, and provides little information useful for analysis
of response time problems.

14.6.7.4.3  Analysis  of  TEARS  Emulation  Phase  Reports.  Once  the  TEARS  user 
has  analysed  the  reports  produced  during  the  RESPONSE  phase  of  reduction, 
and  identified  the  periods  in  which  he  has  interest,  he  should  produce  the 
EMULATION  reports  for  these  periods.  The  fields  of  interest  within  the 
reports  produced  during  this  second  phase  of  data  reduction  are  described  in 
this  section. 

Figure 14-23. (Part 2 of 2)

14.6.7.4.3.1 Exception Message Report. The first report produced during
the EMULATION phase is an index similar to the event log described earlier.
Figure 14-24 illustrates how this report can display the problem discovered
through examination of the RESPONSE phase reports in this section. Note
that terminal TM does have a heavily processor bound subsystem in memory,
but no other exceptions occurred. Several other types of exception can be
displayed by this report, and each is described below.

SWAP  SPACE  DENIED.  The  TSS  swap  files  are  full,  or  not  enough  room  remains 
on  any  one  of  them  to  hold  the  subsystem  for  this  user.  This  may  be  due  to 
the  size  of  the  subsystem  being  swapped  (or,  more  frequently,  pushed  down). 
The  inability  to  swap  this  user  subsystem  is  not  a  denial  to  this  user,  who 
will  continue  to  enjoy  processor  attention  if  required,  but  will  affect 
other  memory  allocation  requests.  In  any  case,  the  resolution  involves 
raising  the  maximum  size  of  the  TSS  swap  files,  as  described  in  section 
14.6.7.3. 

JOUT OUTPUT BUSY. Self explanatory.

CANNOT SWAP. Insufficient room remains on the device containing one of the
TSS swap files to allow that file to grow. Resolution involves
somehow depopulating that device, or allowing the length of disk extents to
be raised by recovering disk fragments by performing a cold boot.

I/O  ERROR  ON  DEVICE.  Self  Explanatory.  Resolution  involves  reformatting 
the  device  during  a  cold  boot. 

UNDEFINED  DEVICE.  This  occurs  when  users  attempt  to  restore  files  from 
systems  having  devices  which  are  not  present.  Privileged  FMS  directives  can 
be  used  to  delete  the  file  reference. 

SEEK  ERROR.  Self  Explanatory.  This  type  of  problem  may  indicate  software, 
hardware,  or  firmware  errors. 

FAILURE  IN  NAME  SCAN.  Program  stack  errors. 

BAD  SPACE  INVENTORY.  Device  error. 

BAD  SPACE  TABLE.  Device  error. 

UNACCOUNTABLE  ERROR  1. 

CHECKSUM  ERROR.  Device  or  I/O  error. 

PAGE  REQUEST  DEADLOCK.  IDS  error. 

UNACCOUNTABLE  ERROR  2. 

BAD  PRIMITIVE.  GCOS  problem,  indicates  some  master  mode  code  destroyed. 

Figure 14-24. Exception Message Report


BAD  PROGRAM  DESCRIPTOR.  GCOS  problem. 

LOOP  IN  PRIMITIVES.  GCOS  problem. 

PROGRAM  TOO  LARGE  TO  SWAP.  The  subsystem  memory  area  assigned  to  this  user 
is  too  large  to  fit  on  any  of  the  TSS  swap  files.  To  resolve,  increase  the 
minimum  and  maximum  sizes  for  these  files. 

BAD  STATUS-SWAP  OUT.  I/O  error. 

BAD  STATUS  -  SWAP  IN.  I/O  error. 

BAD  STATUS  -  LOAD.  I/O  error  on  TSS  program  file  containing  subsystem  being 
loaded. 

ERROR  IN  WRITING  TAP*.  During  TAPE  mode  processing  for  the  user. 

NOT  ENOUGH  CORE  TO  RUN  PROGRAM.  Produced  after  a  settable  interval  has 
passed  to  prevent  memory  allocations  from  becoming  urgent  several  times. 

OUT OF SWAP SPACE. TSS has grown all of its swap files to the currently set
maximum size.

SY** I/O ERROR. I/O error on user current file.

DRL  DEFIL  -  NO  FILE  SPACE.  Self  explanatory. 

DRL  DEFIL  -  NO  PAT  SPACE.  TSS  has  filled  its  SSA  space,  and  cannot  define 
another  file.  Raise  the  number  of  SSAs  to  retain  during  startup. 

DRL  MORLNK  -  NO  FILE  SPACE.  Self  explanatory. 

DRL  MORLNK  -  PAT  FULL.  SSA  space  full. 

DRL JOUT - BATCH SYSTEM FULL. No entries are available in GEOT queue.
Transient problem not related to TSS.

DRL  JOUT  -  LOST  PAT.  GCOS  error. 

DRL  SPAWN  -  NO  PAT.  TSS  SSA  space  full. 

DRL  SPAWN  -  SCHEDULER  QUEUE  FULL.  Self  explanatory.  Transient  problem. 

DRL TASK - NO PROGRAM NO. All program numbers are in use in the system.

DRL  TASK  -  BAD  *J  STATUS.  I/O  or  device  error. 

DRL T.SYOT - SYSTEM LOADED. GEOT queue full. Transient error.

DRL T.SYOT - BACKDOOR FILE NOT CONFIGD. Subsystem software error.

GEMORE  DENIED.  Insufficient  core  memory  for  growth  or  impossible  to  release 
memory  at  this  time.  Transient  problem. 

CPU HOG? CPS SINCE SWAPIN = nnnnnn. This exception is produced when a
subsystem seems to be processor bound. In the example shown in figure
14-24, terminal TM is executing a subsystem which is in this state. The
periods to the right of this exception message are repetition factors which
are used in place of displaying this message more than once each second.

14.6.7.4.3.2 USERS Map Report. The reader is referred to the GMF Users
Manual for an explanation of this report, which cannot be used to analyse or
identify response time problems.

14.6.7.4.3.3 TSS Core Map Report. The reader is referred to the GMF Users
Manual for an explanation of this report, which is also not useful in the
identification or analysis of response time problems.

14.6.7.4.3.4 ERROR MESSAGE Report. This report is produced when errors in
program logic prevent analysis of trace data collected by the GMF/TSSM. The
EMULATION phase of TEARS is a primitive model of the ways in which TSS
itself works. When the actual operation of the system differs from the
model (and if this difference is noticed by TEARS) an error message is
produced.

This  report,  as  with  the  previous  two  reports,  is  not  useful  to  analysts 
interested  in  the  identification  or  analysis  of  response  time  problems. 

14.6.7.4.3.5  INTERTRACE  Reports.  The  purpose  of  these  reports  is  to 
somehow  track  periods  when  the  intervals  between  the  occurrences  of  GMP/TSSM 
traces  (i.e.,  their  generation)  seems  to  be  large.  A  large  (or  long) 
interval  between  the  occurrence  of  GMF/TSSM  traces  would  imply  that 
something  is  preventing  the  execution  of  the  TSS  executive.  If  the  times 
when  these  intervals  are  large  could  be  trapped  and  reported,  some 
investigation  could  be  performed  to  identify  why  the  executive  was  not  being 
dispatched  to. 

The implementation of this concept has been made as a function of the time
delays between GMF/TSSM traces (as determined by the time stamp included
with each trace record), with corrections made for periods when TSS
relinquishes the processor voluntarily (GEWAKE). The concept of this type
of reporting is valid, but the data cannot be collected from the TSS
executive. This report is being examined to improve its usability.
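
The interval computation described above can be sketched as follows. This is a minimal illustration in Python, not the TEARS implementation. The record layout (a time stamp paired with a flag marking a voluntary relinquish), the function name, and the threshold are all assumptions made for the example. The "correction" is simplified here to skipping any gap that follows a voluntary relinquish.

```python
from typing import List, Tuple

# Hypothetical record: (time_stamp_ms, relinquished). `relinquished` is True
# when TSS gave up the processor voluntarily after this trace was written.
Trace = Tuple[int, bool]


def long_intervals(traces: List[Trace], threshold_ms: int) -> List[Tuple[int, int]]:
    """Return (start_time, gap) for each inter-trace gap above the threshold.

    Gaps that follow a voluntary relinquish are skipped, because the
    executive chose not to run during them.
    """
    flagged = []
    for (t0, relinquished), (t1, _) in zip(traces, traces[1:]):
        gap = t1 - t0
        if not relinquished and gap > threshold_ms:
            flagged.append((t0, gap))
    return flagged
```

Each flagged entry marks a period in which the executive apparently wanted the processor but was not dispatched, which is the condition the INTERTRACE reports are intended to expose.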

14.6.7.4.3.6  Memory  Activity  Report.  This  report  can  be  used  to  indicate 
the  frequency  and  times  during  which  activities  occurred  which  are  related 
to  subsystem  memory  management.  By  documentation,  this  report  should  be 
produced  each  time  an  exception  in  processing  is  noted  (section 
14.6.7.4.3.1). Examination of the reports produced by the RESPONSE phase
again indicates discrepancies with this report. These discrepancies are
being investigated.

14.6.7.4.3.7  SUBDISPATCH  Reports.  This  report  is  currently  in  error. 

14.6.7.4.3.8  Users  Swaps  (Swap  Rate)  Report.  This  report  is  not  useful  in 
the  identification  or  analysis  of  response  time  problems,  and  the  user  is 
referred  to  the  GMF  Users  Manual  for  an  explanation  of  its  purpose. 

14.6.7.4.3.9  User  Swaps  (Swap  Amount).  An  example  of  this  report  is  shown 
in  figure  14-25.  This  is  the  best  report  currently  available  which  can  be 
used  to  identify  the  sizes  of  the  subsystems  which  have  become  candidates 
for  swap. 


14.6.7.4.3.10  User  Swaps  (Duration)  Report.  The  User  Swaps  (Duration) 
report  describes  the  length  of  subsystem  swaps,  and  is  illustrated  in  figure 
14-26.  The  user  should  examine  the  content  of  this  report  to  gain  an 
understanding  of  the  volume  of  allocation  work  being  faced  by  the  system,  as 
this is the factor which will influence the duration of swaps for subsystems.

14.6.7.4.3.11  User  I/Os  -  Duration  Report.  This  report  provides  a  measure 
of  the  time  spent  by  users  in  a  relinquished  state  pending  disk  I/O 
operations. There are no tunable parameters associated with I/O operations
which  can  affect  response  times.  In  discovering  the  source  of  a  delay  for  a 
particular  user,  this  report  might  become  useful  in  the  rare  situation  where 
an  I/O  operation  was  delayed  for  some  reason.  Figure  14-27  contains  an 
example  of  this  report. 

The  only  conceivable  way  in  which  an  I/O  operation  might  be  delayed  for  a 
significant  period  is  if  the  hardware  itself  were  at  fault.  If  this  were 
the  case,  alert  messages  would  be  placed  on  the  system  console. 

14.6.7.4.3.12  In  SMC  Wait  (Duration).  As  described  earlier,  the  occurrence 
of the GMF/TSSM traces associated with SMC waiting is symptomatic of a
contention  between  some  TSS  user  and  another  process  using  the  same  SMC 
section.  Figure  14-28  contains  an  example  of  this  report,  which  can  be 
useful  to  identify  the  extent  of  the  delay  incurred.  Resolution  of  the 
problem  indicated  by  these  traces  will  involve  some  modification  to  the  SMC 
section  assignment  of  the  users  affected. 

The  trace  information  collected  by  the  GMF/TSSM  will  be  of  only  marginal 
value  in  the  identification  of  the  processes  which  are  in  contention,  unless 
both  are  TSS  users,  which  is  not  necessarily  the  case. 

14.6.7.4.4  Resolution  of  Response  Time  Problems.  The  sections  above  have 
described  the  indications  of  response  time  problems  which  can  be  discovered 
through  the  review  of  the  reports  produced  by  the  TEARS  system.  In  general, 
the  anomalies  to  the  normally  encountered  values  or  curves  are  the  items 
which should prompt attention or focus. If the problem is a continuing one,
different data tapes should reveal similar problems in the same types of
reports.

Figure 14-28. In SMC WAIT (Duration) Report

The next step in the resolution of a response problem will be a decision as
to whether the problem requires action at all. That is to say, are there any
steps which can be taken to affect the delay which has been discovered? For
example, if a user is being (or was) delayed due to an unusually high
workload, there is really nothing which can be done to improve the situation
without affecting processing during normal periods. Another example of this
type of situation can be encountered by examining some of the TEARS reports
which seem to indicate problems which are not problems at all. Certainly,
the tunable parameters associated with TSS could be used to affect the
balance between Fault queue wait time and Ready queue service time, but this
'solution' might cause more problems than it solves.

Before beginning the modification of the TSS and GCOS software to change the
values reported by the TEARS system (and, hopefully, the response time
experienced by TSS users), site personnel should first consider the cost of
the loss represented by the response problem incurred. If it is not
encountered except during unusual periods, the time and effort (in both
human and machine resources) required to prevent it may not be justified.

The process of analysing and preventing or minimizing response time problems
is an expensive one, and should not be lightly undertaken.

The process involves the modification of the tunable parameters described in
section 14.6.7.3 of this report. The degree of modification required, and
the exact parameters which will need to be modified, can be discovered only
through experimentation. This experimentation must, of course, be carried
out in the same environment as that under which the problem was discovered.

It is unlikely that a single setting change will resolve a response
problem. More likely, a number of settings will need to be tried for
several parameters before an optimal setting is discovered.

The  next  section  of  this  report  describes,  in  general  terms,  the  process  of 
this  modification,  and  the  types  of  parameters  which  should  be  modified  to 
affect  different  problems. 

14.6.7.5  Analysis  of  Response  Time  Problems.  The  resolution  of  response 
time  problems  through  the  modification  of  the  tunable  parameters  described 
in  section  14.6.7.3  of  this  report  is  possible  if  the  problems  are  related 
to  the  TSS  system  itself.  If  the  problems  are  due  to  user  actions  or  some 
circumstance,  such  as  disk  space  fragmentation,  modification  of  these 
parameters  will  have  little  effect,  and  other  types  of  action  are  indicated. 

The  sections  within  this  final  chapter  are  divided  according  to  the  subject 
areas  covered  in  section  14.6.7.3.  If  modification  of  the  parameters 
described in section 14.6.7.3 does not affect the problem encountered, it is
probable that some user or system problem exists. Successful completion of
these procedures will not be possible if the reader is not cognizant of the
effect these modifications are having. Therefore, all modifications
performed should be both preceded and followed by executions of the GMF/TSSM
and TEARS. As more modifications are performed, and more trace data
collected, the nature of the problem can be isolated to a great degree. For
example, if lengthening the dispatches granted to the TSS executive does not
affect response time, the problem will not involve the Sub-dispatch Fault
queue. It will, at that point, not be helpful to increase the frequency of
priority 'B' dispatches granted to TSS.

14.6.7.5.1  Priority  'B'  Processing  Problems.  The  TSS  executive  relies  on 
.MDISP  to  grant  it  enough  processor  attention  to  completely  service  the 
entries  in  its  Fault  queue  in  a  timely  manner.  If  the  executive  is  not 
getting  sufficient  attention,  the  response  times  for  users  performing 
allocation  processes  and  I/O  operations  will  be  increased.  Also,  the  report 
displaying  the  wait  time  for  Fault  queue  service  will  show  high  values. 

To  resolve  these  problems,  the  number  of  times  that  TSS  is  able  to  service 
its  Fault  queue  must  be  increased.  First,  increase  the  length  of  a  TSS 
executive  dispatch  by  modifying  the  patches  to  .MDISP  and  TSSA  shown  in 
section  14.6.7.3.1.  This  should  lower  the  wait  time  for  Fault  queue 
service,  as  well  as  the  total  average  response  times  experienced.  Note  also 
the  slope  of  the  average  response  times  for  users  with  core  requests,  which 
should  also  be  lowered. 

If  this  modification  does  affect  the  response  time,  but  not  to  the  degree 
desired,  the  user  may  wish  to  increase  the  number  of  times  that  TSS  will  be 
granted a priority 'B' dispatch, by lowering the value of bits 30-32 of
location  1  in  .MDISP. 

14.6.7.5.2  TSS  Executive  File  Problems.  If  the  modification  of  TSS 
dispatch  lengths  does  not  affect  the  response  time  for  users  with  core 
requests, the fault may lie with the TSS Swap files. Verify that there is a
high degree of swapping being performed by examining the TSS subtraces
frequency (section 14.6.7.4.2.9). Next, verify that the recommendations
contained in section 14.6.7.3.2 have been carried out, and that the swap
files are not contending with one another by being resident on the same
device.  Before  modifying  any  of  the  TSS  parameters  associated  with  memory 
allocation  or  swap  space  management,  first  implement  these  recommendations 
and  recollect  trace  data. 

If  the  implementation  of  the  suggested  placement  for  TSS  files  does  not 
affect  the  reports  described  above,  the  problem  may  lie  in  the  size  of  the 
files.  Examine  the  trace  frequency  report  (section  14.6.7.4.2.9)  for  trace 
type  2,  which  will  be  produced  if  the  TSS  swap  files  are  too  small.  If 
encountered,  increase  the  size  for  each  of  the  active  swap  files  and 
recollect trace data. If not encountered, examine the counts for the traces
associated with urgent user processing. These traces will indicate a
problem with subsystem memory management rather than the swap files
themselves.  Finally,  ensure  that  the  TSS  program  files  are  not  on  the  same 
device  by  examining  the  system  boot  deck. 
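
The order of checks in this section can be summarized as a small decision sketch. It is illustrative only: the function name, its inputs, and the returned wording are assumptions made for the example, and the counts are expected to come from the trace frequency report (section 14.6.7.4.2.9).

```python
def next_swap_file_step(placement_done: bool, type2_count: int,
                        urgent_count: int) -> str:
    """Suggest the next action for a suspected TSS swap file problem.

    placement_done: the file placement recommendations have been applied
    type2_count:    occurrences of trace type 2 (swap files too small)
    urgent_count:   occurrences of traces for urgent user processing
    """
    if not placement_done:
        return "apply file placement recommendations, then recollect trace data"
    if type2_count > 0:
        return "increase the size of each active swap file, then recollect trace data"
    if urgent_count > 0:
        return "investigate subsystem memory management"
    return "check the boot deck for TSS program files sharing a device"
```

The checks are ordered from the cheapest change (file placement) to the one that requires parameter changes, mirroring the sequence given in the text.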
14.6.7.5.3 TSS Swap File Processing Problems. The symptoms of problems
with  the  algorithms  which  control  the  swapping  of  user  subsystems  are: 
numerous  swaps,  force  swapping,  urgent  user  allocations,  as  well  as  the 
occurrence  of  those  GMF/TSSM  traces  which  record  TSS  size  changes.  These 
problems  are  caused  by  contention  for  a  limited  amount  of  core  storage.  The 
user  should  first  examine  the  size  limitations  placed  on  TSS  before 
performing  any  modifications  to  the  swapping  algorithms. 

The  amount  of  core  storage  which  can  be  given  to  TSS  is,  of  course,  limited 
by  the  amount  of  core  available  to  the  system  as  well  as  that  required  by 
other  processes.  If  the  loading  of  the  system  under  examination  precludes 
resolution  of  swap  file  processing  problems  through  expansion  of  the 
executive  core  space,  the  analyst  may  begin  the  optimization  process  by 
modifying  the  definitions  of  large  subsystems  through  the  directions  shown 
in section 14.6.7.3.7.2. This modification will limit the number of times
that force swaps occur due to urgent allocation processes. This should be
followed by a recollection of trace data, as problems with the files
themselves are much less common than high contention among user subsystems
for  limited  core  space. 

Examine  the  number  of  occurrences  of  the  exception  message  (section 
14.6.7.4.3.1)  SWAP  SPACE  DENIED.  Any  occurrence  of  this  message  should 
cause  modifications  to  the  minimum  sizes,  and  amount  to  grow,  for  these  swap 
files  (section  14.6.7.3.3).  Increase  these  values  and  recollect  trace 
information  until  this  exception  fails  to  occur. 

14.6.7.5.4  Subsystem  Accounting  Problems.  To  evaluate  the  effect  of 
subsystem  accounting  on  the  response  times  being  experienced,  disable  this 
feature  as  shown  in  section  14.6.7.3.4,  and  recollect  the  trace  data. 

Examine  each  of  the  reports  for  any  difference.  Little  will  be  noted. 

Unless  some  GCOS  system  problem  is  being  encountered,  this  processing  will 
have  little  effect.  Unless  the  data  collected  by  subsystem  accounting  is 
required  by  the  site,  however,  the  overhead  represented  by  this  processing 
should  be  turned  off. 

14.6.7.5.5  UST  Memory  Management  Problems.  The  only  problem  which  is 
likely  to  be  encountered  related  to  UST  management  involves  the  delay 
imposed  before  the  release  of  abnormally  disconnected  UST  entries. 

Examine  the  subtraces  encountered  report  (section  14.6.7.4.2.9)  for  the 
following  trace  types: 

22  -  Place  User  in  Reconnect  Mode  (Abnormal  disconnect  has  just  occurred) 
71 - UST area increase by 1K
97 - UST compressions
99 - UST area decrease by 1K

The count for trace type 22 will give an indication of how often these
abnormal disconnects occur. Trace types 71, 97, and 99 will give an
indication of how dynamic the load faced by TSS is during a tracing period.
Examine the event log and the response times reports to gain an
understanding of the number of times the number of active users approaches
the maximum (section 14.6.7.3.5). Finally, look for the number of reconnect
USTs which are eventually recovered. The reconnect will cause the
generation of trace type 58.

Typically,  the  number  of  USTs  which  are  placed  in  reconnect  mode  is  much 
larger  than  the  number  which  are  eventually  reconnected  to.  If  the  system 
is  extremely  dynamic  (traces  71,  97,  99),  or  if  the  number  of  active  users 
approaches  the  maximum,  the  timer  for  these  reconnect  USTs  should  be 
lowered, causing them to become available sooner. The damper on DRUN
sessions  can  also  be  used  to  lower  the  frequency  of  these  trace  types  at 
sites  which  employ  this  deferred  command  file  processing  facility. 
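
The counts discussed above can be combined into a short summary, sketched below in Python. The function name and the input mapping (trace type to count, as read from the subtraces encountered report) are assumptions made for the example; only the trace type numbers come from this section.

```python
from typing import Dict


def ust_summary(counts: Dict[int, int]) -> Dict[str, float]:
    """Summarize UST activity from a mapping of trace type -> count."""
    placed = counts.get(22, 0)      # users placed in reconnect mode
    recovered = counts.get(58, 0)   # reconnects which were completed
    # Trace types 71, 97 and 99 reflect how dynamic the UST area was.
    churn = counts.get(71, 0) + counts.get(97, 0) + counts.get(99, 0)
    ratio = recovered / placed if placed else 0.0
    return {
        "placed": placed,
        "recovered": recovered,
        "recovered_ratio": ratio,
        "ust_churn": churn,
    }
```

A low recovered ratio together with a high churn count would support lowering the reconnect timer, since most of the reserved USTs are never reclaimed by their users.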

14.6.7.5.6  Subsystem  Memory  Management  Problems 

Due  to  the  limited  nature  of  core  storage  which  is  available  to  TSS,  and  the 
large  size  of  the  subsystems  which  are  usually  requested  by  TSS  users,  the 
contention  for  space  in  the  subsystem  memory  area  is  usually  high.  To 
manage  this  space,  TSS  uses  algorithms  which  tend  to  discriminate  against 
large  memory  allocation  requests.  The  degree  to  which  TSS  will  discriminate 
against  these  systems  is  adjustable,  and  will  provide  the  best  mechanism  to 
lower  the  response  times  experienced  by  users  who  require  memory  allocation 
(section  14.6.7.4.2.4).  The  first  parameter  which  should  be  changed  is  the 
size  definition  for  a  large  subsystem.  The  optimal  setting  for  this 
parameter is twice the average subsystem size (computed based on
usage).  Unfortunately,  this  is  not  reported  by  the  TEARS  system. 
Experimentation  will  reveal  that  20K  is  a  good  value  for  this  cell. 
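
Where a site can obtain subsystem sizes and load counts from its own records, the "twice the average size" rule can be worked out directly. The sketch below is hypothetical: TEARS does not report these figures, the input mapping is assumed, and "computed based on usage" is read here as an average weighted by the number of loads.

```python
from typing import Dict


def large_subsystem_threshold(usage: Dict[int, int]) -> float:
    """Return twice the usage-weighted average subsystem size.

    usage maps a subsystem size (in K) to the number of times a
    subsystem of that size was loaded.
    """
    total_loads = sum(usage.values())
    if total_loads == 0:
        raise ValueError("no usage data")
    weighted_mean = sum(size * n for size, n in usage.items()) / total_loads
    return 2 * weighted_mean
```

For example, 50 loads of 4K, 30 loads of 10K, and 20 loads of 24K give an average of 9.8K and a threshold of 19.6K, close to the 20K value suggested above.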

If  modification  of  the  definition  of  a  large  subsystem  does  not  yield  the 
desired  results  against  the  wait  times  experienced  by  users  requiring  memory 
allocation,  the  next  parameter  to  modify  will  be  the  wait  time  multiplier, 
which  will  cause  a  large  memory  allocation  to  wait  longer  before  causing 
swaps  to  occur. 

Ironically,  the  suggested  modifications  to  decrease  the  wait  time  for  memory 
allocations  will  cause  increases  in  certain  allocation  times.  The  effect 
is,  however,  positive.  By  discouraging  large  subsystem  allocations,  smaller 
subsystems  are  able  to  load  faster,  and  terminate,  leaving  space  open  to 
complete  the  large  allocation.  This  principle  has  been  employed  on  heavily 
loaded  systems  with  very  positive  results. 

The  other  factor  which  will  affect  subsystem  memory  allocations,  and  hence, 
response  time,  involves  the  growth  of  the  TSS  executive  within  the  bounds 
established  by  site  options.  The  amount  of  core  which  will  be  requested  by 
the  executive  should  be  adjusted  based  on  the  occurrence  of  trace  types  72 
and  74  (GEMORE  memory,  release  memory).  While  performing  a  GEMORE,  TSS  is 
vulnerable,  and  is  unable  to  service  its  Fault  queue,  causing  delays  for 
users  who  do  not  require  memory. 
By increasing the amount which will be requested on each GEMORE, the number
of these size increases can be minimized. When performing this type of
modification, corresponding changes should be made in the size of a memory
reduction. The amount of core to release should be smaller, but an even
divisor of the amount of core which is attached. Thus, TSS will quickly get
more core, but will give it back more slowly. These patches are described in
section  14.6.7.3.7. 
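
The "smaller, but an even divisor" rule can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, sizes are taken to be whole numbers of K, and "the amount of core which is attached" is read as the amount requested on each GEMORE.

```python
from typing import List


def release_candidates(gemore_k: int) -> List[int]:
    """List release sizes (in K) smaller than gemore_k that divide it evenly."""
    if gemore_k <= 0:
        raise ValueError("gemore_k must be positive")
    return [k for k in range(1, gemore_k) if gemore_k % k == 0]
```

With a GEMORE amount of 8K, the candidates are 1K, 2K, and 4K. Choosing 2K, for instance, means one request is returned in four releases, so TSS acquires core quickly and gives it back gradually.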

14.6.7.5.7  TSS  Executive  Processing  Problems.  Trace  type  3s  (section 
14.6.7.4.2.9)  should  cause  modifications  to  the  number  of  SSAs  reserved  by 
TSS  during  its  initialization  processing,  according  to  the  patches  described 
in  section  14.6.7.3.7. 